Intel TDX Security and Side Channels

PradyumnaShome · July 6, 2024, 6:33pm

Trusted execution environments have emerged as a promising alternative to the use of Zero-Knowledge Proofs (ZKPs) and secure Multi-Party Computation (MPC) for secure, attested, remote computation. Intel TDX seems to offer a simpler and more supportive development environment for verifiable computing. In this article, I will take you through a walkthrough of a selection of security concerns with the technology, beginning with a orientation as to how it fits into a widely studied problem in academic computer security research.

Sandboxing

Sandboxing is a decades-long computer security problem that underpins all of computer systems security research today.

How do we run code in a confidential and integrous manner i.e. without revealing control and and data flow, and without being tampered with?
How do we run code without it compromising the reliability, security, and correctness of the rest of a system (runtime, operating system, VM, bare-metal server)?

First we had the process, an environment in which to run code with a virtual address space (barring side channels, other processes can’t see you, and you can’t see other processes). This was great, because now application A could no longer directly corrupt application B or bring down the system by itself (the OS can configure resource limits per process now). However, this level of sandbox is insufficient for secure remote computation, a scenario which allows for a malicious operating system or hardware.

Source: Mastering Linux Development

Process Table and Control Block from OS Notes by Xinyang Yu
Source: OS Notes by Xinyang Yu

Next, we had the enclave, a process with exclusive resources running on a physically isolated portion of a CPU. Enclaves come equipped with encrypted memory for confidentiality and restricted external access for integrity, and attestation, a verifiable proof of integrity. However, this requires partitioning one’s application into an untrusted (not needing the sandbox) and trusted components (needing the sandbox), and defining an interface between the two. Why is this bad? Trusted code is run an extremely restricted environment with no system calls (the OS controls their implementation of course). Clearly, we need something better. Why can’t we simply throw an entire unmodified application into an “enclave” environment?

Source: SecureStreams: A Reactive Middleware Framework for Secure Data Stream Processing

Here is where Intel’s Trust Domain eXtensions (TDX for short) comes in: the virtual machine-equivalent of an enclave.

Intel TDX is a trusted execution environment , allowing one to run virtual machines which need confidentiality and integrity (but not availability!) guarantees. Confidential VMs, known as Trust Domains (TDs), are isolated in software and hardware, with transitions managed by the TDX Module, which is the trusted component of the otherwise untrusted hypervisor (just like enclaves assume a potentially malicious OS).

Source: Intel

Intel TDX Architectural Overview

Source: Intel TDX Security Research

Confidentiality in Intel TDX

Data

TD Memory is encrypted using Intel Multi-Key Total Memory Encryption – each trust domain can use multiple keys to encrypt pages.

Secret memory is managed by secure Extended Page Tables (EPTs) which use hardware support that is included with Intel VT-X virtualization extensions to handle Guest Virtual Address to Host Physical Address translations.

Shared memory is handled directly by the untrusted hypervisor using a separate EPT.

Code

Code in each Trust Domain can only be executed when the processor is in a special CPU mode known as Secure Arbitration Mode (SEAM), that is handled by the TDX Module. TDX is capable of detecting interference to the CPU such as modifying existing instructions or injecting new instructions in private memory. The CPU state is maintained securely during transitions (handled by the TDX Module) and each TD’s execution context is stored in its own metadata storage area which is encrypted using a TD-specific key.

Integrity in Intel TDX

TDs can also respond to local and remote attestation requests, which are handled by an SGX quoting enclave.

Memory accesses also involve two possible integrity verification measures including Logical Integrity (private memory is tracked using the TD Owner bit) and Cryptographic Integrity (which also computes and validates a MAC of data to detect unauthorized change).

However, rollback attacks (where one state of memory is recorded, and is used to replace a future state) do not seem to mitigated against. Similarly, rowhammer cannot be prevented outright, since it is circuit-level phenomenon beyond the scope TDX.

TDX Attacks

Ahoi Attacks – Heckler

The Ahoi attacks are a pair of vulnerabilities (WeSee and Heckler) related to interrupt-handling in AMD SEV-SNP (AMD’s VM-based TEE offering) and Intel TDX. The hypervisor must hook on interrupts in the interrupt controller to be able to forward them to the right TD. This allows each guest OS to run the interrupt service routine for the corresponding interrupt. However, this allows the hypervisor to inject interrupts willy-nilly.

In Heckler, a malicious TDX hypervisor inserts interrupts at carefully chosen times to cause changes to data and control flow in guest applications such as OpenSSH and sudo, unbeknownst to the application or guest operating system. In one example, if the hypervisor sent int 0x80 which is the interrupt used to run a system call in x86 processors, it could change the value in the rax register. For the sshd binary, when this was done on a user connection, changing the value of rax allowed the authentication process to be bypassed, allowing a malicious hypervisor to connect to a TD, breaking the sandbox around the trusted software.

Malicious interrupts were a cornerstone of the SGX attack landscape, and it looks like they are here to stay with VM-based TEEs.

TDX Bomb

While there are no availability guarantees provided by TDX, interruptions to TD execution is something that could disrupt the correct functioning of a distributed protocol. SGX-Bomb is a rowhammer-based fault attack that flips physical bits in DRAM that back SGX memory. This affects the integrity of the enclave causing it to abort and necessitating a restart.

Now, TDX uses SGX for attestation, which could be one vector. Furthermore, overwriting TD private memory causes an abort if the TD Owner bit is set to 0 on the next read. If the only goal is to cause the TD to be destroyed on a frequent enough basis, one could identify frequently called functions or frequently accessed data structures to continuously flip to prevent the TD from being able to complete execution. In fact, the TDX Module could be an independently interesting region of memory to cause flips in for more severe security violations.

Side Channels

Side-channel attacks are a class of vulnerabilities that involve the unintended disclosure of secret data via side-effects or metadata of computation, including the execution time, power consumption, processor frequency, and temperature changes, that might vary as a function of said data. Intel SGX in particular, has been notorious in the security community for dozens of attacks in since its release in 2015. A lot of these have been mitigated in the design of TDX, but since I’m writing this article in 2024, the journey is nowhere near done.

Here is quick example known as simple power analysis. In the below two figures, we’re attacking an algorithm called left-to-right square-and-multiply which is a routine used in the RSA algorithm to perform modular exponentiation. Here, the secret is the private exponent E. For every bit in E, we always square the intermediate result R_0, but if the bit is 1 we do an additionally multiply. Thus, iterations with e_i = 1 will consume more power, which can be measured using an oscilloscope.

In contrast, software-based microarchitectural side-channels, such as the ones I will present below, do not require specialized equipment and use measurements such as timing that can be obtained purely in software.

Square-and-Multiply Modular Exponentation
Source: Efficient Randomized Regular Modular Exponentiation using Combined Montgomery and Barrett Multiplications

Power Trace of Square-and-Multiply

Source: Power Side-Channel Attack Analysis: A Review of 20 Years of Study for the Layman

Non-TDX Specific

The vulnerabilities below broadly apply to a wide range of CPUs. They rely on CPU microarchitecture that any code executed relies on. There are excellent resources on understanding these issues already and I will not be diving deep in any of these for this article.

Spectre
Meltdown
Cache Side-Channel Attacks
Microarchitectural Data Sampling
Differential Power Analysis
Differential Fault Analysis
Power/Frequency/Thermal Side-Channel Attacks
and more!

TDX Specific

Hypervisor Guest Page Blocking

Intel TDX allows the hypervisor/VMM to block and unblock Guest Physical Address ranges using TDH.MEM.RANGE.BLOCK / UNBLOCK. This allows the VMM to receive EPT faults when code within the Trust Domain reads from pages that are mapped to the blocked range. This feature presumably exists to help with live VM migrations (reallocation of physical resources to one or more VMs) and for the hypervisor to reclaim memory when it is under memory pressure from other TDs.

Having said that, one can employ this command to construct a timing oracle to observe control flow in a fine-grained manner. Consider an if statement in a program where each branch falls on a different page. If the branch direction is determined by a secret bit b, by blocking one of the pages and observing execution, the hypervisor can determine the value of a secret.

This is extremely similar to Controlled Channel Attacks, discovered by researchers from Microsoft Research in 2015. However, several things are different:

The VMM operates in a different address space.
The VMM cannot influence page tables like an OS can for an enclave process. Private memory in a Trust Domain by the TD Guest OS, with the second level address translation handled by the TDX module’s secure EPT.

To mitigate this, one would need to distinguish between benign and malicious guest physical address blocks, which is non-trivial. If we relax the problem constraints by only allowing one TD and ensuring the TD is run on an otherwise quiet system with minimal interrupts, this attack vector may be easier to block.

TD Memory Overwrite by VMM

The hypervisor can overwrite memory within the Trust Domain. However, if TD memory is written to outside of SEAM mode, the TD Owner bit that is present in each cache line is set to 0 (newer versions are likely to use more bits to distinguish Trust Domains, based on personal communication with Intel employees). This bit is set to 1 when in a special CPU mode known as Secure Arbitration Mode (SEAM).

On the next read by the TD, this mismatch in bits is detected by the CPU, which leads to a TD abort (the VM is destroyed). In a similar manner as above, the hypervisor can strategically overwrite memory in one side of a secret-dependent branch to learn the secret predicate. In this case, our oracle is destructive, since the TD’s state is destroyed and it must be restarted. A possible mitigation could involve detecting an abnormal number of restarts.

Ciphertext Side-Channel Attacks

TEEs use memory encryption to prevent users in other execution contexts from learning sensitive data, in addition to the physical isolation used to prevent access. However, AMD SEV, a VM-based TEE solution allows hypervisors access to the VM State Save Area on VM Exits. The VM save area is akin to the process control block for a UNIX process, that stores information about the execution context of the VM, including the program counter, register file, and interrupt and exception information.

While this is encrypted in memory, this is unencrypted in the CPU, it can be viewed on state exits. Furthermore, since randomness is expensive, AMD uses deterministic encryption scheme, which was broken in an attack called CipherLeaks. The below image is a simplified explanation of one case of the vulnerability, from a mitigation known as CipherFix. Essentially, when performing a swap of two pages x and y, the ciphertext is preserved when they are not swapped, revealing secret bit b.

Source: Slides from CipherFix

In TDX, TD memory is stored in plaintext in the CPU. When the hypervisor attempts to read from TD secret memory, it receives all zeroes, since the CPU is not executing in SEAM mode. However, since only a single bit is used to determine whether a page is secret or not (the TD Owner bit), it may be possible for a TD to obtain ciphertext for another TD.

This can be made harder for adversaries with the use of multiple keys by each TD, usage of message authentication codes for integrity, and oblivious RAM (AES XTS uses tweaking to adjust the ciphertext based on the sector number).

Final Thoughts

As the quote from the song Side Channels are Everywhere by A Few Mistakes Ago goes:

Side channels are everywhere; most of the time we’re unaware

What I presented was a selection of TDX-specific security vulnerabilities. For side-channels, there are various mitigations that can be applied from a software and system-level to reduce the attack surface, including using constant time implementations (particularly for cryptography) to reduce timing channels, enabling data operand independent timing bits (DOIT) to turn off leaky performance optimizations, masking secrets in memory to reduce the impact of ciphertext side-channel attacks, using attestations to prevent security-critical from running in an environment with simultaneous multithreading enabled, and using oblivious random access memory (ORAM) to access large data structures (particularly those located in shared memory).

Side-channel security is a dynamic, evolving, and active field of research. It is important to be on the lookout for and apply the latest security patches, microcode updates, and other mitigations to reduce the potential for information disclosure, particularly in a world where information is money.

How can this relaxation be justified - how can the hypervisor prove to relying party that only one TD is running? I think the point is there’s a tradeoff of false positives (honest sysadmin has a hard time keeping system up) / false negatives (attacker gets away with it), but if only one TD is supported false positives are no longer an issue so we can crank up the threshold.

This was a good intro as the scope suggested, but it leaves a cliff hanger for future work. Imo the problem is this leaves us with a hard time making decisions - which of these mitigations are already present in, for example, the Canonical TDX reference guest image? Which of the others can be skipped? Which of these make sense to apply a platform level, vs which should be passed to application developers to be aware of

PradyumnaShome · July 8, 2024, 10:50pm

Thank you for the thought-inspiring questions, Andrew.

Re: AMD SEV
Yes, there are a bunch of VM-based TEEs out there now, including ARM CCA and IBM PEF, beyond AMD SEV/SEV-SNP, and Intel TDX. The Intel TDX Demystified paper (linked in Further Reading) has a comparison of some of these peer technologies.

Re: SCONE/Enarx/Gramine
I was not aware of Enarx and SCONE. All three sound like complementary deployment platforms for apps wanting to use one of several TEEs. As of now, it is unclear what the security level differences are for VM vs Process-based TEEs outside of the developer experience, which it seems like these software frameworks already seem to bridge. This could be an avenue for future research and probably depends heavily on specific TEE implementations. Great question though!

In particular, I do know that Intel TDX does come with several side-channel mitigations based on lessons learned from the last decade of Intel SGX vulnerabilities. E.g. mitigations for timer interrupt-based / replay attacks like SGX-Step/Microscope

This depends on which integrity mode is used. Logical integrity (LI) is weaker and just uses the TD Owner bit, whereas Cryptographic integrity (CI) also involves the memory controller recomputing and verifying MACs for loads, that are stored in ECC memory. Since these mitigations are applied in the memory controller, it will be able to detect writes from the hypervisor as being executed in the untrusted (non-SEAM mode) and clear the TD Owner bit in ECC memory. This can be surmounted using Rowhammer. Furthermore, the hypervisor cannot directly obtain ciphertext since reads return all 0s. This might work with a malicious TD that is also running in SEAM mode (similar to the setting I described for the ciphertext side-channel attack).

From my current understanding, I believe replay attacks would need to be done using physical access (to get around memory controller-level protections and ECC memory).

x86_64 processors use the SYSCALL instruction to run system calls, whereas x86_32 processors use interrupts – this is a special case which bridges these two worlds. The fix in the kernel was to disable 32-bit emulation mode by default, which can be enabled with a flag if needed. One would want to check if the kernel used in Yocto has this mode disabled, which would restrict TDX Guests to run only 64-bit OSes and applications.

If I understand correctly, you’re asking how we can technically enforce this. I don’t know the answer off the top of my head. The place to control this would be in the TDX Module, which is responsible for managing TDs and their private memory. Either enforcing using a specific version of a signed TDX module, or including TDX module configurations in attestation quotes in a high-integrity manner would be two ways to approach this. As of now, I believe only Intel can deploy changes to the (signed) TDX Module.

Mitigations with where it makes to integrate them

Constant time programming (application / library)
Data operand independent timing (DOIT) bits (application / library)
SMT off (already attestable since it is a BIOS setting that shows up in the quotes)
ORAM (application / system)
Secret masking (application would need an extra binary instrumentation step)

Mitigations 1-3 should always be done and are recommended for security-sensitive code by multiple vendors.

Mitigation 4 would be necessary on successful exploitation of the page blocking oracle, which would leak memory access patterns.

Mitigation 5 would be necessary on successful exploitation of ciphertext side-channels on TDX. This would have a significantly higher bar compared to AMD SEV-SNP and Intel SGX, given the difficulty of accessing ciphertext from a TD by another component (malicious TD/VMM), the ability of applications to use multiple encryption keys, and the fact that each TD has its own address space, EPT, and private HKID needed to access memory. A physical attack could undermine some of these protections, but further research is needed.