Trusted Platform Modules: Concepts and capabilities #

This document serves as an introduction to the capabilities of Trusted Platform Modules (TPM). We will focus our discussion on version 2.0 of the TPM specification released by the Trusted Computing Group. The emphasis will be on the TPM’s authentication system and its capabilities beyond key management.

Each of the subsequent sections will link relevant parts of the Trusted Computing Group’s specifications. The main specification is split into four parts:

Architecture,
Structures,
Commands, and
Supporting Routines.

Only the first three parts are relevant for this document. We will refer to specification Family 2.0, Level 00, Revision 01.38, released September 29, 2018. Additionally, we will refer to the PC Client Platform Firmware Profile specification Family 2.0, Level 00, Version 1.05, Revision 23, released May 7, 2021 as “Firmware” and the PC Client Platform TPM Profile Specification for TPM 2.0 version 1.05, Revision 14, released September 2, 2020 as “Platform”.

Authentication #

There are two methods of authentication when interacting with the TPM: knowledge of an AuthValue or satisfying a Policy (Part 1, Section 19.1).

AuthValues are secrets that the caller must prove knowledge of. This can be done either by sending the AuthValue as part of the command or by computing the HMAC of a TPM-provided nonce using the AuthValue as the key (Part 1, Section 19.4 & 19.6). Hierarchies, Ordinary Objects (keys), and NV Indices all have AuthValues. For objects and NV indices, the AuthValue is provided during creation; changing it afterwards requires the dedicated TPM2_ObjectChangeAuth and TPM2_NV_ChangeAuth commands. The TPM2_HierarchyChangeAuth (Part 3, Section 24.8) command can be used to change the AuthValue of a hierarchy. To do so, the caller needs to authenticate itself within that hierarchy.

Policies, also known as Enhanced Authorization (Part 1, Section 19.7; Part 3, Section 23), offer an alternative way to authenticate actions. A Policy is a hash digest (and some auxiliary state, Part 1, Section 19.7.8) that is set during the creation of an object in the TPM. To authenticate using a policy, the caller starts a policy session, which is an object residing in the TPM that maintains a policy digest. This digest is initialized with zeros and updated by invoking Policy* commands. During authentication, the policy digest maintained by the session is compared to the policy set in the object; the two must be identical for authentication to succeed. The policy commands either fold TPM state into the policy digest or verify specific conditions before updating it. Consequently, access to an object can be made conditional on TPM state.

For instance, when the PolicyPCR (Part 3, Section 23.7) command receives a policy session’s handle and a set of PCR indices, it updates the Policy Digest of the session in the following manner:

$$ \mathit{policyDigest} \leftarrow \mathrm{H}(\mathit{policyDigest} || \mathtt{0x17F} || \mathit{pcrs} || \mathit{digest}) $$

where $H$ is the hash function chosen when starting the policy session, $pcrs$ is the serialized selection of PCR indices, and $digest$ is the hash of the contents of the PCRs selected in $pcrs$. To ensure that a key can only be used when PCR[0] has a specific value, the policy digest generated by PolicyPCR can be computed in advance and set as the key’s policy. If the key is then created with the userWithAuth attribute cleared, the only way to use it is to call PolicyPCR in a policy session on a TPM whose PCR[0] holds the expected value.
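The PolicyPCR update can be precomputed offline. The following is a minimal Python sketch of that computation, not a reference implementation. The field widths (32-bit command code, 32-bit selection count, 16-bit algorithm identifier, 8-bit `sizeofSelect`), the choice of a three-byte selection bitmap, and the all-zero `expected_pcr0` value are assumptions made for illustration and should be checked against Part 2 and Part 3 before relying on the resulting digest.

```python
import hashlib
import struct

TPM_CC_POLICY_PCR = 0x0000017F  # command code used in the formula above
TPM_ALG_SHA256 = 0x000B         # TCG algorithm identifier for SHA-256


def pcr_selection(pcr_indices, size_of_select=3):
    """Serialize a single-bank PCR selection (assumed layout, see lead-in)."""
    bitmap = bytearray(size_of_select)
    for i in pcr_indices:
        bitmap[i // 8] |= 1 << (i % 8)
    # count (u32) || hash algorithm (u16) || sizeofSelect (u8) || bitmap
    return struct.pack(">IHB", 1, TPM_ALG_SHA256, size_of_select) + bytes(bitmap)


def policy_pcr(policy_digest, pcr_values):
    """Return H(policyDigest || TPM_CC_PolicyPCR || pcrs || digest).

    pcr_values maps a PCR index to its expected 32-byte SHA-256 bank value.
    """
    indices = sorted(pcr_values)
    digest = hashlib.sha256(b"".join(pcr_values[i] for i in indices)).digest()
    return hashlib.sha256(
        policy_digest
        + struct.pack(">I", TPM_CC_POLICY_PCR)
        + pcr_selection(indices)
        + digest
    ).digest()


# A fresh policy session starts with an all-zero digest.
session_digest = bytes(32)
expected_pcr0 = bytes(32)  # hypothetical expected value of PCR[0]
key_policy = policy_pcr(session_digest, {0: expected_pcr0})
print(key_policy.hex())
```

The value printed at the end is what would be set as the key’s policy at creation time; a session on a TPM with a different PCR[0] value ends up with a different digest.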

Policy sessions also allow limiting the precise action that can be authenticated. The PolicyCommandCode (Part 3, Section 23.11) command updates the policy digest with a given command code and limits the session to only be used to authenticate this particular command. The PolicyCpHash (Part 3, Section 23.13) command further limits the command to a particular invocation by requiring the on-wire serialization of the command to have a predefined checksum (Part 1, Section 18.7). Both commands can only be called once per policy session.

Some Policy commands allow replacing the session’s policy digest with a new value if a certain check passes. One of them is PolicyAuthorize (Part 3, Section 23.16). This command requires the caller to provide a signature over

$$ \mathrm{H}(\mathit{currentPolicyDigest} || \mathit{policyRef}) $$

where $\mathit{policyRef}$ is a freeform string that allows the same signing key to be used for different objects’ policies. Without it, every policy that contains a call to PolicyAuthorize may potentially accept every signature made by a single signing key. To prevent this, each object would have a different $\mathit{policyRef}$ string. Because the string is included in the signed message, two PolicyAuthorize invocations with different $\mathit{policyRef}$ strings, but the same signing key, produce different policy digests.

The signature is validated by a key loaded in the TPM by the caller. If it is valid, PolicyAuthorize replaces the current PolicyDigest with

$$ \mathit{policyDigest} \leftarrow \mathrm{H}(0^{\lvert\mathrm{H}\rvert} || \mathtt{0x16A} || \mathrm{H}(\mathit{public}) || \mathit{policyRef}) $$

where $0^{\lvert\mathrm{H}\rvert}$ is an all-zero string of digest length (the previous policy digest is discarded) and $\mathit{public}$ is the public part of the TPM-resident key. This allows for later updates to a policy. The owner of an object such as an NV index would create a signing key with a secret AuthValue. The NV index’s policy can then be set to the policy digest computed by PolicyAuthorize. The owner would then choose a public but unique $\mathit{policyRef}$ and sign $\mathrm{H}(\mathit{approvedPolicy} || \mathit{policyRef})$, where $\mathit{approvedPolicy}$ is the policy the owner currently wants to use for the NV index. For each new policy, the NV index owner signs a new digest. PolicyAuthorize leaves all other state of the policy session untouched.

Example Policies #

We will develop a static and an updatable authentication policy for a TPM-resident RSA encryption key. The key should be usable for decryption by any user as long as the UEFI has not been manipulated. All other operations on the key should require additional authorization. We could set the key’s AuthValue to an empty buffer, but this would make the key usable regardless of the UEFI state. Additionally, this would allow users to also encrypt data with it. The solution is to use a policy. When creating the key, we make sure the userWithAuth attribute is cleared to disable AuthValue-based authentication for TPM2_RSA_Decrypt. We set the policy to the SHA-256 digest $p$ that is computed as follows.

$$ \def \H{\operatorname*{H}} \def \cc{\:\Vert\:} \begin{equation} \begin{split} p_{\it init} &= 0^{\lvert\H\rvert} \\ p_{\it pcr} &= \H(p_{\it init} \cc \mathtt{0x17F} \cc \mathtt{0x01} \cc h \cc \mathtt{0x01} \cc \mathtt{0x01} \cc \H(\mathrm{PCR}[0])) \\ p &= \H(p_{\it pcr} \cc \mathtt{0x16C} \cc \mathtt{0x159}) \\ \end{split} \end{equation} $$

The function $\operatorname{H}$ is SHA-256, and $0^{\lvert\operatorname{H}\rvert}$ represents a string of 32 zero bytes. The variable $h$ is the TPM’s 16-bit algorithm identifier for SHA-256. It is important to note that SHA-256 can be replaced with any cryptographic hash function that is implemented by the TPM.

The equations above describe both how to precompute the policy digest when creating the key and the steps a user later takes to reproduce that digest in a policy session.

First, the digest is initialized with zeros. Then, it is hashed with the serialized list of the selected PCR and the PCR values themselves. Finally, the intermediate digest is hashed with the authorized command’s identifier, which is 0x159 for TPM2_RSA_Decrypt. The resulting value, denoted as $p$, is set as the key’s policy digest at the time of creation.

Later, any user can start a policy session and call TPM2_PolicyPCR with the same PCR selection that was used before (PCR[0]). The TPM will then hash that selection, the current PCR value, and the policy session’s policy digest, which was initialized to all zeros. The resulting digest becomes the new policy session digest.

Next, the user calls TPM2_PolicyCommandCode with the command code of TPM2_RSA_Decrypt as an argument. Once again, the TPM will hash the policy digest with the command code and a constant. The resulting digest replaces the policy digest of the session. Additionally, the session’s commandCode field is set to the supplied command code of TPM2_RSA_Decrypt.

Now, the user can call TPM2_RSA_Decrypt using the policy session for authorization. The TPM will then verify that the policy digest of the policy session matches the one set for the RSA key and that the commandCode field matches the command being authorized. Because all other inputs to the policy digest computation are public constants, the digest is simply a function of the value of PCR[0].

$$ p = f(\rm PCR [0]) $$

If PCR[0] has the value that was assumed during the computation of $p$, the authorization passes. Other commands cannot be authorized because of the check on the session’s commandCode field.
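To make the two checks concrete, here is a small Python model of the static policy. It is a sketch under stated assumptions: `PolicySession`, `authorize` and `run_session` are hypothetical names that only mimic the two session fields discussed above, the byte-level serialization is an assumption, and the PCR values are made up. No real TPM is involved.

```python
import hashlib
import struct

TPM_CC_RSA_DECRYPT = 0x159
TPM_CC_POLICY_COMMAND_CODE = 0x16C
TPM_CC_POLICY_PCR = 0x17F
TPM_ALG_SHA256 = 0x000B


def sha256(*parts):
    return hashlib.sha256(b"".join(parts)).digest()


def cc(code):
    """Command codes are assumed to be serialized as 32-bit big-endian values."""
    return struct.pack(">I", code)


class PolicySession:
    """Toy model of a policy session: a digest and a commandCode field."""

    def __init__(self):
        self.policy_digest = bytes(32)  # p_init
        self.command_code = None

    def policy_pcr0(self, pcr0):
        # Selection of PCR[0] in the SHA-256 bank: count, alg, sizeofSelect, bitmap.
        selection = struct.pack(">IHB", 1, TPM_ALG_SHA256, 3) + b"\x01\x00\x00"
        self.policy_digest = sha256(
            self.policy_digest, cc(TPM_CC_POLICY_PCR), selection, sha256(pcr0))

    def policy_command_code(self, code):
        self.policy_digest = sha256(
            self.policy_digest, cc(TPM_CC_POLICY_COMMAND_CODE), cc(code))
        self.command_code = code


def authorize(session, key_policy, command):
    """The two checks from the text: digest match and commandCode match."""
    return session.policy_digest == key_policy and session.command_code == command


def run_session(current_pcr0):
    session = PolicySession()
    session.policy_pcr0(current_pcr0)
    session.policy_command_code(TPM_CC_RSA_DECRYPT)
    return session


good_pcr0 = sha256(b"trusted firmware")            # hypothetical PCR[0] value
key_policy = run_session(good_pcr0).policy_digest  # p, fixed at key creation

print(authorize(run_session(good_pcr0), key_policy, TPM_CC_RSA_DECRYPT))            # True
print(authorize(run_session(sha256(b"tampered")), key_policy, TPM_CC_RSA_DECRYPT))  # False
print(authorize(run_session(good_pcr0), key_policy, 0x15E))  # False: a different command code
```

The second call fails on the digest comparison and the third on the commandCode comparison, which are the two independent conditions the TPM checks.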

The downside of this static policy is that the accepted value of PCR[0] cannot be changed afterwards. The policy digest of a key (or any TPM object) cannot be updated. A new key would need to be generated. To make policies updatable, the TPM2_PolicyAuthorize command can be used. This associates a signing key with the policy. This key can sign updates to the policy as illustrated below.

$$ \def \H{\operatorname*{H}} \def \cc{\:\Vert\:} \begin{array}{lrcll} \bf {Setup} & \it policyRef &=& \text{“PCR-BOUND-RSA-KEY”} \\ & (\it pub, \it priv) &\leftarrow& \operatorname*{Gen}(1^n) \\ & \it sig &=& \operatorname*{Sign}(\it priv, \H(p_{\it pcr}^{\prime} \cc \it policyRef)) \\ & \quad \\ \bf{Authorization} & p_{\it init} &=& 0^{\lvert\H\rvert} & \\ & p_{\it pcr} &=& \H(p_{\it init} \cc \tt 0x17F \cc 0x01 \cc {\bf h} \cc 0x01 \cc 0x01 \cc \H(\rm PCR [0])) \\ & p_{\it reset} &=& 0^{\lvert\H\rvert} & \iff \operatorname*{Verify}(\it pub, \it sig, \H(p_{\it pcr} \cc \it policyRef)) \\ & p_{\it auth} &=& \H(p_{\it reset} \cc \tt 0x16A \cc \H(\it pub) \cc \it policyRef) & \\ & p &=& \H(p_{\it auth} \cc \tt 0x16C \cc 0x159) & \\ \end{array} $$

Using TPM2_PolicyAuthorize begins with a setup phase where a new signing key is generated, and the unique but public $policyRef$ value is chosen. The key is then used to sign the hash of the expected policy digest of the updatable part, together with the $policyRef$. The key does not have to be generated by the TPM, but its public part needs to be loadable by it, i.e., it must be a key type that the TPM implements.

The signature, public key, and $policyRef$ are then used in the authorization phase. First, a new policy session is created, which is initialized with an all-zero policy digest. Second, the party wanting to use the key calls TPM2_PolicyPCR. As before, this results in a PCR[0]-dependent policy digest. Third, TPM2_PolicyAuthorize is called with the loaded public signing key, the signature, and the $policyRef$ value from the setup phase. The TPM will now verify that the signature is valid for the hash of the current policy digest concatenated with the $policyRef$. If so, the TPM will reset the policy digest to all zeros and then hash it together with the loaded signing key’s public key and the $policyRef$ value. The result becomes the new policy digest of the session. From here, the policy session continues as above with a call to TPM2_PolicyCommandCode and TPM2_RSA_Decrypt.

Observe that the policy digest computed by TPM2_PolicyAuthorize is constant and no longer dependent on the value of PCR[0]; only the signature over the previous policy digest is.

$$ p = {\bf g} \iff \operatorname*{Verify}(\it pub, \it sig, f'(\rm PCR [0])) $$

For each new, permitted value of PCR[0], a new signature can be issued without affecting any TPM state. This scheme splits the policy into two parts: a dynamic prefix that is authorized using a signing key, and a static suffix that depends on a constant, precomputable value.
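The split between dynamic prefix and static suffix can be illustrated with a short Python model. This is a sketch only: it follows the simplified formulas of this section rather than the exact serialization in Part 3, `public_key` is a placeholder byte string, and the signature check is modelled as membership in a set of messages the owner has signed (the hypothetical `is_signed` callback stands in for $\operatorname{Verify}$). No real signature scheme is used.

```python
import hashlib
import struct

TPM_CC_POLICY_AUTHORIZE = 0x16A


def sha256(*parts):
    return hashlib.sha256(b"".join(parts)).digest()


def policy_authorize(policy_digest, public_key, policy_ref, is_signed):
    """Model of the PolicyAuthorize step.

    is_signed(message) stands in for Verify(pub, sig, message).
    Returns the new session digest or raises if the check fails.
    """
    if not is_signed(sha256(policy_digest, policy_ref)):
        raise PermissionError("no valid signature for the current policy digest")
    reset = bytes(32)  # p_reset: the previous digest is discarded
    return sha256(reset, struct.pack(">I", TPM_CC_POLICY_AUTHORIZE),
                  sha256(public_key), policy_ref)


policy_ref = b"PCR-BOUND-RSA-KEY"
public_key = b"hypothetical public key bytes"

# Prefix digests as they would result from TPM2_PolicyPCR (placeholders here).
prefix_v1 = sha256(b"prefix digest for firmware v1")
prefix_v2 = sha256(b"prefix digest for firmware v2")

# Setup: the owner signs H(prefix || policyRef); the set models valid signatures.
signed_messages = {sha256(prefix_v1, policy_ref)}


def is_signed(message):
    return message in signed_messages


p_auth_v1 = policy_authorize(prefix_v1, public_key, policy_ref, is_signed)

# A firmware update changes PCR[0]; the owner issues one more signature.
signed_messages.add(sha256(prefix_v2, policy_ref))
p_auth_v2 = policy_authorize(prefix_v2, public_key, policy_ref, is_signed)

print(p_auth_v1 == p_auth_v2)  # True: the digest after PolicyAuthorize is constant
```

Because the digest after the authorize step does not depend on the prefix, the policy stored in the key never has to change; only the set of signed prefixes grows.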

Hierarchies #

All TPM-managed objects exist in one of the following four hierarchies (Part 1, Section 13):

Platform hierarchy, managed by the firmware.
Endorsement hierarchy, managed by the operating system.
Owner hierarchy, managed by the user.
Null hierarchy, an ephemeral hierarchy also managed by the user.

Additionally, there is a Lockout hierarchy that is only used as an authentication target.

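Hierarchies have an AuthValue and Policy associated with them. For the Platform hierarchy, both are reset to the empty buffer each time the TPM starts; for the Owner, Endorsement and Lockout hierarchies, they persist across restarts until the TPM is cleared.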

While hierarchies are isolated from each other, meaning that one hierarchy cannot access the objects of another hierarchy, the Platform hierarchy is more powerful than the others. Only the Platform and Lockout hierarchies can clear the TPM (Part 3, Section 24.6), and changing the PCR allocation (Part 3, Section 22.5) requires the Platform hierarchy. To prevent later code from having Platform rights, the UEFI changes the Platform hierarchy’s AuthValue to a random value (Firmware, Section 8.3.3). The operating system should do the same for the Endorsement hierarchy. Furthermore, most actions authenticated by the Platform hierarchy also require Physical Presence. This is an assertion that cannot be raised by a TPM command, but only through an out-of-band mechanism implemented by the platform firmware. The exact implementation is not specified, but it should require pressing a button or performing a similar non-automatable action (Part 1, Section 19.3).

Platform Configuration Registers #

One of the important features of Trusted Platform Modules is recording all code and data consumed during boot. The TPM maintains a set of registers called PCRs (Part 1, Section 17), each of which stores a hash over some part of the boot process. PCRs are numbered from zero upwards. Each PCR is assigned a different part of the boot process, with higher numbers generally reserved for later stages. PCR contents are reset to all zero bits on each power cycle, and PCR 17 and up can also be reset during TPM operation (Platform, Table 6).

During boot, the host CPU sends data, known as measurements, to the TPM’s PCRs using either TPM2_PCR_Extend or TPM2_PCR_Event (Part 3, Sections 22.2 & 22.3). Generally, each boot stage measures the next stage before handing off execution. This way, malicious code in the boot flow can be detected. PCRs cannot be written to directly; each new measurement is concatenated with the previous PCR contents and then hashed (Part 1, Section 17.2). The operation is called extension and looks like the following pseudocode.

$$ \mathit{PCR} \leftarrow \mathrm{H}(\mathit{PCR} || \mathit{data}) $$

Each PCR consists of multiple banks. Each bank corresponds to a different cryptographic hash algorithm implemented by the TPM. A PCR can have multiple banks active at the same time. Thus, the operation above is run once per active bank, with a different $\mathrm{H}$ but the same $\mathit{data}$. The active banks can be configured using TPM2_PCR_Allocate (Part 3, Section 22.5), which requires authenticating using the Platform hierarchy.
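The extend operation and the role of banks can be modelled in a few lines of Python. This is an illustrative sketch: the `Pcr` class and `replay` helper are hypothetical, the bank selection (SHA-256 and SHA-384) is an arbitrary choice, and the measurements are made-up byte strings fed directly into the formula above rather than the digests a real boot flow would send.

```python
import hashlib


class Pcr:
    """Toy PCR holding one value per active bank (hash algorithm)."""

    def __init__(self, banks=("sha256", "sha384")):
        # Every bank starts as all zero bits, sized to its digest length.
        self.banks = {alg: bytes(hashlib.new(alg).digest_size) for alg in banks}

    def extend(self, data):
        # Same data, different H per bank: PCR <- H(PCR || data)
        for alg in list(self.banks):
            self.banks[alg] = hashlib.new(alg, self.banks[alg] + data).digest()


def replay(measurements):
    """Recompute the final bank values from an ordered list of measurements."""
    pcr = Pcr()
    for measurement in measurements:
        pcr.extend(measurement)
    return pcr.banks


boot_log = [b"firmware", b"bootloader", b"kernel"]  # hypothetical measurements
print(replay(boot_log)["sha256"].hex())
print(replay(boot_log) == replay(boot_log))                  # True
print(replay(boot_log) == replay(list(reversed(boot_log))))  # False
```

Replaying the same measurements in the same order reproduces the PCR value, while any change in content or order yields a different one; this is what allows a verifier to compare a PCR against an expected boot sequence.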

Each PCR has an AuthValue and Policy hash associated with it, which can be changed (Part 3, Sections 22.6 & 22.7). In practice, the AuthValue is set to the empty buffer (Platform, Section 4.6), allowing everyone to extend into all PCRs.

Non-Volatile RAM #

The majority of TPM state is volatile and is reset after a power cycle. Keys need to be stored outside the TPM and loaded into it for each use, and PCRs as well as the Platform hierarchy’s AuthValue and Policy are reset on each start. The main user-accessible storage that is maintained across reboots is NV indices (Part 1, Section 37). TPMs have at least 3 kB of non-volatile memory (Platform, Table 4) that can be partitioned using indices. As with keys, users can define as many indices as they want until all memory is used up. Each index has an authentication policy, an AuthValue, a hierarchy and so on. When defining an index, its type (Part 2, Section 13.2) has to be chosen from the following list:

Ordinary: The index occupies a constant amount of memory filled with arbitrary data.
Counter: The index is a 64-bit number that can only be read and incremented.
Bits: The index is a 64-bit number whose individual bits can be manipulated and queried.
Extend: The index is large enough for a digest of a user-defined hash function and behaves like a PCR.

Depending on the type, only certain operations can be done on an NV index. Only ordinary indices can be overwritten with TPM2_NV_Write (Part 3, Section 31.7), only counters can be incremented with TPM2_NV_Increment (Part 3, Section 31.8), only bits type indices can have bits set with TPM2_NV_SetBits (Part 3, Section 31.10) and only extend type indices can be extended using TPM2_NV_Extend. All indices can be read using TPM2_NV_Read (Part 3, Section 31.13) and certified using TPM2_NV_Certify (Part 3, Section 31.16).
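The mapping between index types and permitted operations can be summarized as a small Python model. This is a sketch for orientation only: `NvType` and `NvIndex` are hypothetical names, the method names merely echo the TPM commands, authentication and locking are left out, and details such as initial values and partial writes are simplified assumptions.

```python
import hashlib
from enum import Enum


class NvType(Enum):
    ORDINARY = "ordinary"
    COUNTER = "counter"
    BITS = "bits"
    EXTEND = "extend"


class NvIndex:
    """Toy NV index that enforces the per-type operations listed above."""

    def __init__(self, nv_type, size=32):
        self.nv_type = nv_type
        if nv_type in (NvType.COUNTER, NvType.BITS):
            self.value = 0            # 64-bit number
        else:
            self.value = bytes(size)  # ordinary data or a SHA-256 digest

    def _require(self, nv_type, operation):
        if self.nv_type is not nv_type:
            raise TypeError(f"{operation} is not allowed on a {self.nv_type.value} index")

    def write(self, data):            # cf. TPM2_NV_Write
        self._require(NvType.ORDINARY, "write")
        if len(data) != len(self.value):
            raise ValueError("an ordinary index has a constant size")
        self.value = data

    def increment(self):              # cf. TPM2_NV_Increment
        self._require(NvType.COUNTER, "increment")
        self.value += 1

    def set_bits(self, bits):         # cf. TPM2_NV_SetBits
        self._require(NvType.BITS, "set_bits")
        self.value |= bits & (2**64 - 1)

    def extend(self, data):           # cf. TPM2_NV_Extend
        self._require(NvType.EXTEND, "extend")
        self.value = hashlib.sha256(self.value + data).digest()

    def read(self):                   # cf. TPM2_NV_Read, allowed for all types
        return self.value


counter = NvIndex(NvType.COUNTER)
counter.increment()
print(counter.read())  # 1

flags = NvIndex(NvType.BITS)
flags.set_bits(0b0101)
flags.set_bits(0b0010)
print(bin(flags.read()))  # 0b111
```

Calling an operation that does not belong to the index type, for example `counter.write(...)`, raises a `TypeError` in this model, mirroring the TPM rejecting the command.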

NV counters that do not have the TPMA_NV_ORDERLY (Part 3, Section 31.2) bit set are immediately written back to NVRAM after each change. Indices that have the bit set are called Hybrid Indices. A Hybrid Index is kept in volatile memory and only written to NVRAM when the TPM is shut down. In case of a power failure, the changes to the index are lost.

When creating an index, the caller defines how read and write access needs to be authenticated. The choices are:

the index’s AuthValue or Policy,
the Owner hierarchy’s AuthValue or Policy, or
the Platform hierarchy’s AuthValue or Policy.

The authentication source can be separately configured for read and write access.

NV indices can be locked. Each index can be individually read or write-locked (Part 3, Section 31.11 and 31.14), or globally write-locked (Part 3, Section 31.12).

In both cases, depending on the type of lock, read or write access will fail. Locking indices individually is done by sending the TPM2_NV_ReadLock or TPM2_NV_WriteLock command with the handle of the index to lock. The global write lock is engaged by sending the TPM2_NV_GlobalWriteLock command, which write-locks all indices that have the TPMA_NV_GLOBALLOCK attribute set. This command works across hierarchies.

Locked NV indices can be configured to release the lock when the TPM is restarted (Part 1, Section 37.2.6.1) or remain locked.

Deleting an NV index can be done in two ways: either by deleting the individual index or by clearing the entire TPM. The former is done with the TPM2_NV_UndefineSpace or TPM2_NV_UndefineSpaceSpecial command (Part 3, Section 31.3 & 31.4) depending on whether the index requires an AuthValue or policy session for deletion. Clearing the TPM using the TPM2_Clear command (Part 3, Section 24.6) deletes all NV indices that were not created with the TPMA_NV_PLATFORM_CREATE attribute set. Setting this attribute requires authenticating with the Platform hierarchy. The clear command itself requires either Lockout or Platform hierarchy authentication.