According to the original vSphere feature list, there is a new security feature called “VMkernel Protection” that uses the Trusted Platform Module (TPM) to add a layer of protection to the VMkernel. The VMkernel (hypervisor) is the most critical component of a virtual host because, if it is compromised, the VMs running on it can easily be compromised too. VMware therefore introduced a new protection mechanism in vSphere to ensure the integrity of the VMkernel both on disk and in memory. Here is how VMware describes it:
VMkernel Protection – As part of ongoing efforts to protect the hypervisor from common attacks and exploits, mechanisms were introduced to assure the integrity of the VMkernel and loaded modules as they reside on disk and in memory. Disk-integrity techniques protect the boot-up of the hypervisor using the Trusted Platform Module (TPM), a hardware device embedded in servers. To ensure the authenticity and integrity of dynamically loaded code, VMkernel modules are digitally signed and validated during load-time. These disk integrity mechanisms protect against malware, which might attempt to overwrite or modify VMkernel as it persists on disk. VMkernel also uses memory integrity techniques at load-time coupled with microprocessor capabilities to protect itself from common buffer-overflow attacks that are used to exploit running code. These techniques create a stronger barrier of protection around the hypervisor. See the ESX Configuration Guide and the ESXi Configuration Guide.
Having a strong interest in security, I was curious about this feature and wanted to try it out, so I did some research on it. TPM is a security specification developed by the Trusted Computing Group (TCG) that uses cryptographic keys to protect information. It relies on a TPM chip, which has a unique RSA key burned into it, can perform platform authentication, and can be used to verify that software has not been changed. vSphere uses the TPM to protect the boot-up of the hypervisor, and VMkernel modules are digitally signed and validated at load time to protect against malware that might overwrite them. This is similar in spirit to the Windows File Protection feature that Microsoft has built into Windows to prevent critical system files from being modified or overwritten.
TPM-based protection also depends on support in processors and chipsets, and as with most technologies Intel and AMD each have their own version. Intel’s is called Trusted Execution Technology (TXT), which has been available for some time; AMD’s is called Secure Execution Mode (AMD has published very little information on it) and is not widely available. For this to work you need both a CPU with the necessary processor extensions and a chipset that supports them, in addition to the TPM chip itself. TPM uses Platform Configuration Registers (PCRs), which are like containers that each hold a 160-bit value, and they work as follows:
- At boot, all PCRs are initialized to a known value (all zero bits or all one bits)
- Software can then measure a piece of code or data by computing its hash value
- The resulting measurement is folded into a PCR by hashing it together with the current value of the PCR; this process is called “extending the PCR”
- PCRs can be extended multiple times until a final value is calculated
- Each code segment is measured and validated and control passes from one code segment to the next
- PCRs represent an accumulated measurement of the history of the executed code beginning with power-up
- TPM signing keys can be used to sign the values of PCRs
- The system state can then be verified from the hashes that get stored into the PCRs
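The PCR steps above can be sketched in a few lines of Python. This is only an illustration of the extend operation, assuming SHA-1 (which matches the 160-bit PCR size) and an all-zero starting value. The `measure`, `extend`, and `replay` names and the boot-chain byte strings are hypothetical stand-ins, not anything taken from vSphere or a real TPM library.

```python
import hashlib

PCR_SIZE = 20  # 160 bits, the size of a SHA-1 digest


def measure(data: bytes) -> bytes:
    """Measure a piece of code or data by hashing it."""
    return hashlib.sha1(data).digest()


def extend(pcr: bytes, measurement: bytes) -> bytes:
    """Extend a PCR: new value = SHA-1(old value || measurement)."""
    return hashlib.sha1(pcr + measurement).digest()


def replay(stages) -> bytes:
    """Accumulate the measurements of every stage, starting from a zeroed PCR."""
    pcr = bytes(PCR_SIZE)  # assumption: PCR starts as all zero bits
    for code in stages:
        pcr = extend(pcr, measure(code))
    return pcr


# Hypothetical stand-ins for the code segments executed during boot.
boot_chain = [b"bootloader", b"vmkernel", b"module-a"]
good = replay(boot_chain)
tampered = replay([b"bootloader", b"vmkernel-modified", b"module-a"])

print(good.hex())
print("unchanged boot matches:", replay(boot_chain) == good)
print("tampered boot matches: ", tampered == good)
```

Because each new value depends on the previous one, changing any stage, or even the order of the stages, produces a different final PCR value. That is why a single register can stand in for the whole history of executed code.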
The technology behind TPM is a bit complex, and if you wish to read more there are some great resources at the end of this post that you can check out. As I wanted to see this technology in action, I ordered a TPM chip for one of our servers so I could try it out. The chips are fairly cheap; for HP servers they are about $39. They consist of a small circuit board that plugs into a TPM slot located on the motherboard of the server.
There is also a pin that secures it so if it is ever removed you will know it has been tampered with.
Once the chip is inserted, some new security options for configuring the TPM chip appear in the server BIOS.
Once I received the chip and put it in the server, I turned to the vSphere documentation to set it up. The problem was that there was no documentation on how to do this, despite it being advertised as a new vSphere security feature. The ESXi configuration guide had one short paragraph on TPM, which didn’t explain how to set it up and use it:
This module is a hardware element that represents the core of trust for a platform and enables attestation of the boot process, as well as cryptographic key storage and protection. As part of the boot process, ESXi measures the VMkernel by the TPM, and changes to the VMkernel are logged from one boot to the next. Measurement values are propagated to vCenter Server, and can be retrieved by third-party agents using the vSphere API.
Frustrated, I reached out to VMware to figure out how to use this feature. Some of the information I was able to get is below:
- TPM is only supported with ESXi.
- You need a TCG-compliant BIOS, and TXT needs to be enabled in the BIOS. Once it is enabled, you need to enable use of tboot from the UI Advanced configuration option for the ESXi host (the host has to be added to VC to be able to do this).
- There are some messages in the serial log that can be used to monitor TPM. A VC API is provided so that third parties can fetch the TPM PCRs. If TXT was successful, the VMkernel fingerprint is reported in PCR19; if the host has TPM but TXT was not used, it will show in PCR8; otherwise the PCRs should be NULL.
- There might not be any production server platforms out there ‘today’ which can support TXT.
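To make the PCR19/PCR8 rule above concrete, here is a small Python sketch of how a monitoring script might interpret a set of PCR values once it has them. It does not call any vSphere API. The `classify_boot` function, its return labels, and the idea that the values arrive as a dictionary keyed by PCR index are all assumptions made for illustration, as is treating a missing or empty value as NULL.

```python
from typing import Mapping, Optional


def classify_boot(pcrs: Mapping[int, Optional[bytes]]) -> str:
    """Interpret PCR values according to the rule described by VMware above.

    Assumption: an unset (NULL) PCR is represented as None, an empty value,
    or a missing key. A real API may represent it differently.
    """
    if pcrs.get(19):   # TXT was successful: fingerprint reported in PCR19
        return "txt"
    if pcrs.get(8):    # TPM present but TXT not used: fingerprint in PCR8
        return "tpm-only"
    return "none"      # no VMkernel fingerprint reported at all


# Hypothetical 20-byte fingerprint used purely as sample input.
sample = bytes.fromhex("ab" * 20)
print(classify_boot({19: sample}))
print(classify_boot({8: sample}))
print(classify_boot({}))
```

A check like this would only be the first step; to detect tampering you would still need to compare the reported fingerprint against a known-good value recorded from a trusted boot.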
I never did find the “tboot” advanced parameter that was supposed to be enabled. I checked all through the VMkernel advanced settings and didn’t see anything that was even close. It seems that while TPM provides some great additional protection for the VMkernel, it is not yet ready to be used. The building blocks are there in vSphere, but none of the supporting features needed to use it effectively exist yet. For example, there is no way to monitor the feature, so even if you could enable it there would not be much value in it. I expect both 3rd party vendors and VMware will develop the missing pieces in a future release (note the ESX & ESXi 4.1/4.5 version #’s in the videos), and I look forward to being able to fully utilize this new security feature.