Confidential Computing: Running AI Workloads on Untrusted Infrastructure

The Trust Problem in AI-as-a-Service

As organizations rush to adopt AI, a critical question emerges: How do you protect sensitive training data and inference requests when they run on infrastructure you don’t fully control?

Whether you’re a healthcare provider processing patient data, a financial institution analyzing transactions, or an enterprise with proprietary models — the moment your data hits the cloud, you’re trusting someone else’s security. Traditional encryption protects data at rest and in transit, but during processing? It’s decrypted and vulnerable.

Enter Confidential Computing — the ability to process encrypted data without ever exposing it, even to the infrastructure operator.

How Confidential Computing Works

At its core, Confidential Computing creates hardware-enforced Trusted Execution Environments (TEEs) — isolated enclaves where code and data are protected from everything outside, including the hypervisor, host OS, and even physical access to the machine.

The Key Technologies

  • Intel TDX (Trust Domain Extensions) — VM-level isolation with encrypted memory, hardware-attested trust
  • AMD SEV-SNP (Secure Encrypted Virtualization – Secure Nested Paging) — Memory encryption with integrity protection against replay attacks
  • ARM CCA (Confidential Compute Architecture) — Realms-based isolation for ARM processors
  • NVIDIA Confidential Computing — GPU TEEs for accelerated AI workloads

The magic: cryptographic attestation proves to you — remotely and verifiably — that your workload is running in a genuine TEE with the exact code you intended.

Why This Matters for AI

AI workloads are uniquely sensitive:

Asset                Risk Without Protection
Training data        PII exposure, regulatory violations, competitive intelligence leaks
Model weights        IP theft, model extraction attacks
Inference requests   User privacy violations, business data exposure
Inference results    Sensitive predictions leaked to adversaries

Confidential Computing addresses all four — your data is encrypted in memory, your model is protected, and neither the cloud provider nor a compromised admin can see what’s happening inside the TEE.

Practical Implementation: Confidential Containers

The good news: you don’t need to rewrite your applications. Confidential Containers bring TEE protection to standard Kubernetes workloads.

The Stack

┌─────────────────────────────────────────┐
│           Your AI Application           │
├─────────────────────────────────────────┤
│         Confidential Container          │
│    (encrypted memory, attested boot)    │
├─────────────────────────────────────────┤
│     Kata Containers / Cloud Hypervisor  │
├─────────────────────────────────────────┤
│         AMD SEV-SNP / Intel TDX         │
├─────────────────────────────────────────┤
│          Cloud Infrastructure           │
│    (untrusted - can't see inside TEE)   │
└─────────────────────────────────────────┘

Key Projects

  • Confidential Containers (CoCo) — CNCF sandbox project, integrates with Kubernetes
  • Kata Containers — Lightweight VMs as container runtime, TEE-enabled
  • Gramine — Library OS for running unmodified applications in Intel SGX
  • Occlum — Memory-safe LibOS for Intel SGX

Cloud Provider Support

All major clouds now offer Confidential Computing:

  • Azure — Confidential VMs (DCasv5/ECasv5), Confidential AKS, AMD SEV-SNP & Intel TDX
  • GCP — Confidential VMs, Confidential GKE Nodes, Confidential Space
  • AWS — Nitro Enclaves (a different isolation model), plus AMD SEV-SNP support on select EC2 instance types

Azure Example: Confidential AKS

az aks create \
  --resource-group myRG \
  --name myConfidentialCluster \
  --node-vm-size Standard_DC4as_v5

Your pods now run in AMD SEV-SNP protected VMs — with memory encryption enforced by hardware.
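A quick heuristic can confirm a pod actually landed on a confidential node: recent Linux guests running under AMD SEV-SNP expose a /dev/sev-guest device via the kernel's sev-guest driver. A minimal sketch (a hint only; cryptographic attestation, discussed next, is the real proof):

```python
import os

def looks_like_snp_guest() -> bool:
    """Heuristic: True if the SEV-SNP guest device node is present.

    Presence of /dev/sev-guest suggests we're inside a SEV-SNP VM with
    the guest driver loaded; absence proves nothing about other TEEs.
    """
    return os.path.exists("/dev/sev-guest")
```

Treat this as a smoke test during rollout, not a security control.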

Attestation: Trust But Verify

How do you know your workload is actually running in a TEE? Remote Attestation.

The TEE generates a cryptographic quote — signed by the hardware itself — proving:

  1. The hardware is genuine (not emulated)
  2. The TEE firmware is unmodified
  3. Your specific code/container image is loaded
  4. No tampering occurred during boot

You verify this quote against the hardware vendor’s root of trust before sending any sensitive data.

# Example: verify attestation before inference
# (get_tee_attestation, verify_quote, and send_inference_request are
# placeholders for your attestation client library)

class SecurityError(Exception):
    """Raised when the TEE's attestation quote fails verification."""

attestation_quote = get_tee_attestation()  # signed quote from the hardware
if verify_quote(attestation_quote, expected_measurement):
    response = send_inference_request(encrypted_data)
else:
    raise SecurityError("Attestation failed - TEE cannot be trusted")

Performance Considerations

Confidential Computing isn’t free:

  • Memory encryption overhead: 2-8% for SEV-SNP, varies by workload
  • Attestation latency: Milliseconds per verification (cache results)
  • Memory limits: TEE-protected memory may have size constraints
  • GPU support: Still maturing — NVIDIA H100 supports Confidential Computing, but ecosystem tooling is catching up

For most AI inference workloads, the overhead is acceptable. Training large models in TEEs remains challenging due to memory constraints.
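The attestation-latency point above is usually handled with a short-lived cache: verify a quote once, then reuse the result for a bounded window before re-verifying. A minimal sketch, with the TTL value and the fetch/verify callables as illustrative assumptions rather than any specific attestation SDK:

```python
import time

ATTESTATION_TTL_SECONDS = 300  # re-verify every 5 minutes; tune per policy
_cache = {}  # endpoint -> (verified_at, result)

def is_attested(endpoint, fetch_quote, verify_quote):
    """Return the cached attestation result for an endpoint, re-verifying
    only when the cached entry is older than the TTL."""
    now = time.monotonic()
    entry = _cache.get(endpoint)
    if entry is not None and now - entry[0] < ATTESTATION_TTL_SECONDS:
        return entry[1]
    ok = verify_quote(fetch_quote(endpoint))  # the expensive step
    _cache[endpoint] = (now, ok)
    return ok
```

A shorter TTL means fresher guarantees; a longer one amortizes verification cost across more requests.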

Use Cases in Regulated Industries

Healthcare

Train diagnostic AI on patient data from multiple hospitals — no hospital sees another’s data, the model improves for everyone.

Finance

Run fraud detection models on transaction data without exposing transaction details to the cloud provider.

Multi-Party AI

Multiple organizations contribute data to train a shared model — Confidential Computing ensures no party can access another’s raw data.
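In practice this guarantee is enforced with attestation-gated key release: each party registers its data key with a broker, and the broker hands a key out only to a TEE whose attested measurement is on an allow-list (the pattern behind the Confidential Containers Key Broker Service). A minimal sketch; the measurement strings and class names here are illustrative:

```python
class KeyBroker:
    """Toy key broker: releases data keys only to allow-listed TEEs."""

    def __init__(self, allowed_measurements):
        self._allowed = set(allowed_measurements)
        self._keys = {}  # party name -> data-encryption key

    def register_key(self, party, key):
        self._keys[party] = key

    def release_key(self, party, attested_measurement):
        """Return a party's key only if the requester's attested
        measurement is on the allow-list; otherwise refuse."""
        if attested_measurement not in self._allowed:
            raise PermissionError("measurement not allow-listed; key withheld")
        return self._keys[party]
```

A real broker would also verify the quote's hardware signature chain before trusting the measurement at all; the allow-list check is only the final step.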

Getting Started

  1. Identify sensitive workloads — Not everything needs TEE protection; focus on regulated data and proprietary models
  2. Choose your cloud — Azure has the most mature Confidential AKS offering today
  3. Start with inference — Confidential inference is easier than confidential training
  4. Implement attestation — Don’t skip verification; it’s the foundation of trust
  5. Monitor performance — Measure overhead in your specific workload

Confidential Computing shifts the trust model fundamentally: instead of trusting your cloud provider’s policies and people, you trust silicon and cryptography. For AI workloads handling sensitive data, that’s a game-changer.