Phala's Recap of TEE.salon Singapore: GPU TEE and more

The 3rd TEE.salon just happened! With contributions from more than 30 TEE accelerators, there were several interesting talks:

  • Glue and Coprocessor Architectures - Vitalik (EF)
  • Framework for trustless TEE chip design - Michael Gao (Fabric Cryptography)
  • Benchmarking and evaluating GPU-based TEEs - Hang Yin (Phala)
  • DCAP library v4: A unified interface to verify TEEs on-chain - Zheng Leong Chua (Automata)

Here, I’d like to briefly summarize the key takeaways from the Phala team’s perspective.

GPU TEE and Performance Evaluation

Besides CPU TEEs, there are now GPUs with full TEE capabilities. This enables trustless, data-intensive applications like AI training and inference.

Two GPUs with TEE support are now available:

  • NVIDIA H100: 94GB VRAM and 3.9 TB/s bandwidth (general access in Q1 2024)
  • NVIDIA H200: 141GB VRAM and 4.8 TB/s bandwidth (early bird access in Q3 2024)

GPU TEE works as follows, and its design introduces some overhead:

  • The GPU is passed through to a Confidential VM (CVM)
  • The driver in the CVM sets up an end-to-end encrypted channel between the CVM and the GPU
  • The GPU generates a Remote Attestation report to bootstrap trust
  • Unmodified code can run in CVM to access the GPU
  • The encrypted IO is the main source of overhead

Benchmarks using vLLM show that:

  • Throughput overhead: less than 7% on TPS (tokens per second), less than 3% on QPS
  • Latency overhead: significant increase in Time to First Token (~20%) but less in Inter-Token Latency (less than 7%)
  • Overhead decreases with longer sequences or larger models
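As a minimal sketch of how such overhead percentages can be derived, the snippet below compares a baseline run against a TEE-enabled run. The function names and all numbers are hypothetical placeholders for illustration, not measurements from the talk.

```python
def throughput_overhead(baseline: float, tee: float) -> float:
    """Fraction of throughput lost in TEE mode (higher throughput is better)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - tee) / baseline


def latency_overhead(baseline: float, tee: float) -> float:
    """Fractional latency increase in TEE mode (lower latency is better)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (tee - baseline) / baseline


# Hypothetical (baseline, TEE) measurements, NOT results from the talk.
runs = {
    "tps": (1000.0, 940.0),    # tokens per second
    "ttft_ms": (50.0, 60.0),   # Time to First Token, in milliseconds
}

print(f"TPS overhead:  {throughput_overhead(*runs['tps']):.1%}")
print(f"TTFT overhead: {latency_overhead(*runs['ttft_ms']):.1%}")
```

Throughput and latency use opposite signs because a drop in throughput and a rise in latency both count as overhead.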

Conclusion: GPU TEEs provide a practically viable solution for verifiable and privacy-preserving computation with manageable overhead.

:point_right: Full Slides

Calls to Action

We are closer than ever to the mass adoption of TEE. The demand is clear, and the supply is emerging. Many pressing challenges are waiting for TEE-based solutions, and the next generation of TEEs promises significantly better performance and developer experience.

However, there is still much work to be done to turn this vision into reality. Here are our key calls to action:

  1. Need for Democratizing TEE

    • To build a secure, open-source TEE, we need support from the Web3 community.
    • Reality: 99% of people in the Web3 community still don’t understand TEE.
    • Why? Because building on TEE was hard and took months.
    • It’s now possible to reduce the effort from months to minutes, thanks to TDX & SEV-SNP!
    • With tools to democratize TEE, more developers will onboard, and the community will provide more support to build better TEE.
  2. Build a Decentralized Root-of-Trust

    • TEE relies on a hardware Root-of-Trust (HW RoT) to prove its authenticity. But a HW RoT has serious drawbacks:
      • a. When the RoT is compromised, the TEE hardware cannot recover.
      • b. TEE apps are tightly locked to a physical TEE because they rely on the HW RoT.
    • The RoT can instead be built in software, even as a decentralized MPC network controlled by Ethereum. The benefits include:
      • a. The RoT is no longer vulnerable to physical attacks.
      • b. TEE is abstracted and hot-swappable.
      • c. No vendor lock-in, allowing apps to move freely between TEEs.
    • More details: Early Thoughts on Decentralized Root-of-Trust
    • We call on researchers and developers to explore and build decentralized RoT solutions.
  3. Build TEE Ecosystem Together

    • TEE is a very new and complex technology. Many pieces are yet to be figured out, including tooling, security research, hardware and software supply, etc.
    • We need those who are already building on TEE to join forces to accelerate its adoption.
    • We encourage those who are unfamiliar with TEE to engage with the TEE community to discover how TEE can help them build a better Web3.