Dstack: speedrunning a p2p Confidential VM

Here’s a prototype for building P2P self-replicating confidential VMs (CVMs). It is submitted for inclusion in (or at least to influence) the T/acc movement. It is in the same cinematic universe as Sirrah, but oriented around CVMs instead.

The full code in the repository is meant to be short enough to read and understand in its entirety (under 1000 lines).

You can play along using any Linux environment with QEMU. You do not need a real TDX machine, since a dummy service can supply the TDX DCAP attestations.

Dstack Overview

Each Kettle node on the network is a CVM that runs a payload application container. For this testnet, there is an orchestration contract on Sepolia that manages the onboarding process for new kettles and tracks the currently deployed version of the application. Here’s a high-level illustration of the components that we’ll go over:

Replicatoor

When you first start your VM, it will try to join the network and get a copy of the shared key. This requires posting a Register message on Sepolia, and passing along a quote. The existing node checks your quote before giving you a copy of the key.
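Here is a minimal sketch of the check an existing node might run before handing over the key. Everything specific in it is an assumption made for illustration: the `Quote` fields, the placeholder `EXPECTED_MEASUREMENT`, and the choice to bind `report_data` to the registering address with SHA-256 are all hypothetical. Verification of the DCAP signature chain is left out entirely.

```python
import hashlib
from dataclasses import dataclass
from typing import Optional

# Hypothetical measurement of the audited CVM image (placeholder value).
EXPECTED_MEASUREMENT = hashlib.sha256(b"audited-cvm-image").hexdigest()


@dataclass
class Quote:
    measurement: str    # stands in for the TDX measurement fields
    report_data: bytes  # caller-chosen data bound into the quote


def bind_address(address: str) -> bytes:
    """report_data the new node is assumed to request: a hash of its address."""
    return hashlib.sha256(address.lower().encode()).digest()


def onboard(registered_address: str, quote: Quote, shared_key: bytes) -> Optional[bytes]:
    """Release the shared key only if the quote matches the Register message.

    A real node would also verify the DCAP signature chain and would encrypt
    the key to the new node instead of returning it in the clear.
    """
    if quote.measurement != EXPECTED_MEASUREMENT:
        return None  # not the audited image
    if quote.report_data != bind_address(registered_address):
        return None  # quote was not produced for this registration
    return shared_key
```

The point of the binding step is that a quote copied from another node's registration cannot be replayed under a different address.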

For the first node, we simply check the quote generated at “bootstrap” time into the source code repository. It becomes part of the security auditor’s scope of attention, just like smart contract constructor parameters.

Notice that all of the DCAP quote handling is kept off-chain, keeping gas costs very low.

This implementation uses real TDX DCAP attestations. If you run the VM outside of TDX, it instead asks a dummy service for the attestation (a real CVM that produces a remote attestation for any report_data you ask for).

The replicatoor key is also used to provide EVM-friendly remote attestations, simply by ECDSA-signing a message.
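To make "ECDSA-signing a message" concrete, here is a small pure-Python sketch of signing and verifying over secp256k1, the curve the EVM uses. It is illustration only, not the repository's code: it is not constant-time, and SHA-256 stands in for keccak256 because `hashlib` does not provide keccak. On-chain, a contract would typically use `ecrecover` to recover the signer's address rather than verify against a public key.

```python
import hashlib
import secrets

# secp256k1 domain parameters.
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141
G = (0x79BE667EF9DCBBAC55A06295CE870B07029BFCDB2DCE28D959F2815B16F81798,
     0x483ADA7726A3C4655DA4FBFC0E1108A8FD17B448A68554199C47D08FFB10D4B8)


def add(a, b):
    """Add two curve points; None represents the point at infinity."""
    if a is None:
        return b
    if b is None:
        return a
    (x1, y1), (x2, y2) = a, b
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if a == b:
        lam = 3 * x1 * x1 * pow(2 * y1, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)


def mul(k, point):
    """Double-and-add scalar multiplication."""
    result = None
    while k:
        if k & 1:
            result = add(result, point)
        point = add(point, point)
        k >>= 1
    return result


def digest(message: bytes) -> int:
    # Assumption: SHA-256 stands in for keccak256 here.
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(priv: int, message: bytes):
    z = digest(message)
    while True:
        k = secrets.randbelow(N - 1) + 1  # fresh nonce per signature
        r = mul(k, G)[0] % N
        s = pow(k, -1, N) * (z + r * priv) % N
        if r and s:
            return (r, s)


def verify(pub, message: bytes, sig) -> bool:
    r, s = sig
    if not (0 < r < N and 0 < s < N):
        return False
    w = pow(s, -1, N)
    point = add(mul(digest(message) * w % N, G), mul(r * w % N, pub))
    return point is not None and point[0] % N == r
```

Any node holding the replicatoor key can produce such a signature, and anyone who knows the corresponding public key can check it without touching a DCAP quote.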

Unstoppable TLS

The replicatoor key is used to derive a TLS private key, so all the nodes can serve HTTPS under the same certificate.
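One way to get the same TLS key on every node is a deterministic key derivation from the shared secret. The sketch below uses HKDF with SHA-256 (RFC 5869) built from the standard library. The post does not say which KDF the repository uses, so HKDF, the `dstack/tls/` label, and the function names are assumptions.

```python
import hashlib
import hmac


def hkdf_sha256(key_material: bytes, info: bytes, length: int = 32, salt: bytes = b"") -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    prk = hmac.new(salt or b"\x00" * 32, key_material, hashlib.sha256).digest()
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_tls_seed(replicatoor_key: bytes, domain: str) -> bytes:
    """Derive a 32-byte seed for the TLS private key.

    The "dstack/tls/" prefix is a hypothetical domain-separation label, so the
    TLS key never equals a key derived for another purpose.
    """
    return hkdf_sha256(replicatoor_key, b"dstack/tls/" + domain.encode())
```

Because the derivation is deterministic, two nodes that hold the same replicatoor key arrive at the same seed, and therefore the same TLS key, without exchanging anything further.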

This approach supports both proactive checking (like RATLS and aTLS) as well as browser-friendly optimistic auditing with Certificate Transparency.

For proactive checking, simply inspect the public key for the certificate and check it has been signed by the replicatoor key. This suffices as evidence the session is with an enclave that has passed the onboarding flow.

Since we can’t make ordinary browsers do this proactive check, we can instead do our best with Certificate Transparency. This is where browsers accept certificates as long as they know they’re published on a transparency log. Auditors should check that every public key contained in a certificate appearing here has been signed by the replicatoor key.

KubernEthes

The smart contract keeps track of the current version of the app, including its sha256 hash. To see the current app, open the contract on the explorer and call “container()”.

A monitor script on the CVM simply checks the blockchain for changes and reloads the application container, pulling from the network or untrusted host volume as needed.
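A rough sketch of what such a monitor loop could look like. It is not the repository's script: `get_onchain_hash` and `reload` are hypothetical callables that the caller supplies (for example, a wrapper around the contract's `container()` call and a function that restarts the container). The part worth noting is that the image fetched from the untrusted host is hashed and compared with the on-chain value before anything is reloaded.

```python
import hashlib
import time
from pathlib import Path
from typing import Callable, Optional


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large images do not need to fit in memory."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()


def check_once(onchain_hash: str, running_hash: Optional[str], image: Path,
               reload: Callable[[Path], None]) -> Optional[str]:
    """Reload if the chain moved and the local image matches; return the running hash."""
    if onchain_hash == running_hash:
        return running_hash  # nothing changed
    if not image.exists() or sha256_file(image) != onchain_hash:
        return running_hash  # host supplied a missing or wrong image; keep the old app
    reload(image)
    return onchain_hash


def monitor(get_onchain_hash: Callable[[], str], image: Path,
            reload: Callable[[Path], None], interval: float = 15.0) -> None:
    """Poll forever; `get_onchain_hash` and `reload` are supplied by the caller."""
    running: Optional[str] = None
    while True:
        running = check_once(get_onchain_hash(), running, image, reload)
        time.sleep(interval)
```

Passing a stub for `get_onchain_hash` also gives a simple way to exercise the loop without a blockchain connection.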

Try it! The demo should look like this:

Discussion

This is meant to be a provocative prototype. Everything written in Python or Bash (100% of the repo) is meant to be discardable and replaceable with Rust. Eventually this will be a pull request to “Tstack”, the production version SUAVE might use.

Here are some of the main takeaways:

  • permissionless onboarding with smart contracts doesn’t have to be hard
  • encourage moving complexity out of the TEE and into the untrusted host where it’s more manageable
  • the same certificates can be used for both ordinary (optimistic) TLS and proactive checking
  • encapsulating the raw DCAP remote attestations in the onboarding process means EVM-friendly attestations can be used everywhere else

Other things left to do:

  • DCAP verification is not in the CVM yet
  • Smart contract is missing on-chain PCCS / TCBInfo / MRTD
  • We did not cover key rotation yet
  • Handling of keys and key derivation needs a look over
  • Still to integrate with a real TDX build
  • Reproducibility of the CVM using virt-customize is poor
  • The untrusted host service is not so robust yet
  • The on-chain container orchestration is pretty simplistic (just one app) but fun to think about further

Amazing prototype. We have talked with a lot of developers and I believe this is exactly the direction they want to see – an SDK that is:

  • Easy to use: lets you convert any Docker image to run inside a TEE CVM with minimal effort
  • Secure by default: follows the best practices for TEE security
  • No vendor lock-in: abstracts the hardware / cloud provider details away from the developers

At the same time, after @socrates1024 shared his idea of dstack, we (the Phala team) have also played around with the idea and built our own version of “dstack”: Phala-Network/dstack on GitHub.

The intention was to prototype it and eventually merge to “Tstack” as well. I’m still putting our ideas together as a forum post to share later.

Inline comments below:

Brilliant idea. So after the security audit, the auditor essentially establishes the trust from the code repo to the binary by its hash in the measurement.

Is the first quote still verified onchain? If not, do you mean to delegate the verification to social consensus, i.e. let a DAO check it and vote on its inclusion?

Yeah, I’m for this approach. It fits into the framework of Early Thoughts on Decentralized Root-of-Trust.

Great point! I’ve put a lot of thought into the “Unstoppable TLS” idea. We call it “RA-HTTPS” internally, focusing on browser compatibility and easier verification. The basic idea is to combine content addressing (e.g. 0xOrchAddr.pod.dstack.dev) with access control over TLS certificate issuance. Passive defense by monitoring Certificate Transparency is a good approach. We also explored some other proactive defense mechanisms. I think we can share more ideas on this soon.

Agreed. A related problem is the boundary between trusted code and untrusted (user) code in the guest CVM. We have also put a lot of thought into it. Hope we can publish some discussion soon.

It’s going to be hard. 😂 Key rotation especially will put a computational burden on apps with a large database.

Re. KubernEthes (k9s😂): What does local testing look like? I think we can have some kind of mock mode to run a standalone stack without accessing the blockchain.


So happy we are speedrunning CVMs this month 🙂 Looking forward to your post!

It’s cool we came up with a similar approach for many parts so far, including key derivation from a replicating key manager, using that to unlock an encrypted disk, and fun ways to deploy container apps by hash.

I imagined running dstack-vm on top of the image already produced by canonical/tdx setup scripts without otherwise changing them. Since you customized the setup I’m looking forward to highlighting any differences there.

I didn’t include a quote-providing lib at all (though I’m happy to have highlighted dummy attestation), but the dependency on ppa:kobuk-team/tdx-release stands out to me. It would be great to skip to something more self-contained, whether google-go-tdx or something else.

For the reverse proxy to the guest containers, I was happy nginx worked out, although maybe it adds weight to the image. I know Frieder reached for something custom too (konvera/cvm-reverse-proxy on GitHub), but our last conversation seems to favor separating authorization logic from the reverse proxy.

I have to think more about how to explain it, but I still think the analogy to smart contract auditing is the right viewpoint. If the bootstrap quote is generated at deployment time of new code, then inspecting it may as well be included in any process for auditing the code prior to marketing/distributing it.

This is the only part so far where I want to discuss a different approach. The choice is between two alternatives:

  1. a CA module in the TD issues separate certificates to each app on a subdomain
  2. a reverse proxy in the TD has a single “wildcard” certificate, but it routes based on subdomain in the request

There doesn’t appear to be a benefit to 1, so I think I prefer 2, but I’m interested in any discussion on it.
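To illustrate option 2, here is a minimal sketch of the routing decision a reverse proxy behind a single wildcard certificate would make. The `BASE_DOMAIN` value, the `route` function, and the backend table are hypothetical names for illustration, not taken from either codebase.

```python
from typing import Dict, Optional

# Hypothetical wildcard domain; the certificate would cover *.pod.example.test.
BASE_DOMAIN = "pod.example.test"


def route(host_header: str, backends: Dict[str, str]) -> Optional[str]:
    """Map the Host header of a request to a backend address, or None to reject."""
    host = host_header.split(":", 1)[0].lower().rstrip(".")  # drop port and trailing dot
    suffix = "." + BASE_DOMAIN
    if not host.endswith(suffix):
        return None  # not under the wildcard domain
    label = host[: -len(suffix)]
    if not label or "." in label:
        return None  # a wildcard certificate covers exactly one label
    return backends.get(label)
```

With this shape, adding an app means adding a table entry rather than issuing a new certificate.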


Just a quick heads up. The Phala team have built a Docker simulator based on the dstack concept and used it in the ETHGlobal SF hackathon to illustrate the developer journey. It’s used in the hackathon guide.


dstack fan art from @Fred
