by Scheduler Systems

Self-Hosting AI Agent Sandboxes: Alternatives to E2B's Cloud-Only Approach

If you're running AI agents that execute code, you need sandboxing. Agents that spin up arbitrary processes, write files, or make network calls can't share a host with production workloads. The cloud-only options mean your data and code leave your infrastructure. This post explains the problem, compares the leading options, and covers what self-hosting your own sandbox looks like -- including the hardware requirements and the networking gotchas that will break your isolation if you miss them.


The sandboxing problem for AI agents

AI agents execute untrusted code: LLM-generated shell commands, Python scripts, arbitrary file writes. A container with no special configuration is not an adequate boundary for this. Container escapes are real, and namespaces plus cgroups don't change the fact that all containers on the same host share a kernel. One malicious or buggy system call can affect the host.

For agent sandboxing you need four properties:

  • Kernel isolation -- each agent session runs in its own kernel, not just its own namespace
  • Ephemeral lifecycle -- clean state per agent run, no state leakage between sessions
  • Network policy control -- restrict what the agent can reach on the network
  • Fast startup -- cold start latency directly adds to agent loop latency; a 30-second boot makes interactive agents unusable

Technology           Isolation model                                 Cold start   KVM required?
runc (Docker)        Namespace/cgroup only                           50-200ms     No
gVisor               Userspace kernel (Sentry) intercepts syscalls   50-150ms     No
Kata + Firecracker   Dedicated VM kernel per container               150-300ms    Yes
Kata + QEMU          Dedicated VM kernel per container               500ms-2s     Yes
Traditional VM       Full hardware isolation                         30-90s       Yes

The managed cloud options

E2B

E2B's runtime is Firecracker microVMs -- each sandbox gets its own Linux kernel, not just a container. Cold start is ~150ms using pre-warmed snapshot restore. The SDK covers Python and JavaScript/TypeScript.

Pricing: Hobby tier is free. Pro is $150/month plus usage at $0.000014/vCPU-second. Enterprise pricing is not public.

Two hard limitations. First, no GPU support: Firecracker doesn't support PCIe passthrough. Second -- and this is the core constraint for compliance-sensitive teams -- sandboxes run on E2B's infrastructure, so your code and agent context leave your infra.

Modal

Modal uses gVisor -- a userspace kernel, not a microVM. Cold start is sub-second (500ms-1s typical). The distinctive feature is GPU support -- H100, H200, A100 on-demand. Pricing is ~$0.12/vCPU-hour with $30/month in free credit. Best for stateless compute or ML workloads.

Daytona

Daytona raised $24M in a Series A in February 2026. Runtime is Docker/OCI containers by default with Kata available. Claimed cold start is ~90ms. GPU support is included. Pricing is $0.067/vCPU-hour with no session time limits.

Morph Cloud

Morph Cloud's differentiator is "Infinibranch" -- snapshot and fork an entire running VM state in under 250ms. This enables parallel agent exploration: you can run multiple plan branches simultaneously without re-spinning environments from scratch. Best suited for tree-search workflows (MCTS) and multi-agent parallel exploration.

Cloudflare Dynamic Workers (open beta, March 2026)

Cloudflare Dynamic Workers use V8 isolates with millisecond cold starts (~5ms). Pricing is $0.002/worker loaded/day. The trade-off: JavaScript and TypeScript only -- no Python, no arbitrary binaries. V8 isolates are process-level isolation, not hardware-level.


The self-hosted alternative: Kata microVMs on your own hardware

What Kata containers are

Kata Containers provides OCI-compatible containers backed by a lightweight VM. Each container gets its own Linux kernel -- hardware-enforced isolation, not just namespaces.

Kata supports three VM backends:

  • Firecracker (kata-fc) -- fastest cold start (150-300ms), limited device model. No GPU, no PCIe passthrough. Requires devmapper storage.
  • Cloud Hypervisor (kata-clh) -- fast cold start, broader device support than Firecracker.
  • QEMU (kata-qemu) -- full device compatibility, slower (500ms-2s), full GPU passthrough support.
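On Kubernetes, each backend is exposed as a RuntimeClass. A minimal definition for the Firecracker backend might look like the following -- this is a sketch that assumes the node's containerd is already configured with a `kata-fc` handler:

```yaml
# RuntimeClass mapping the Kubernetes-visible name "kata-fc" to the
# node-level containerd handler that launches Firecracker-backed pods.
apiVersion: node.k8s.io/v1
kind: RuntimeClass
metadata:
  name: kata-fc
handler: kata-fc
```

Pods then opt into the microVM runtime by setting runtimeClassName, as in the Job below.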

A Kubernetes Job using the Firecracker backend:

apiVersion: batch/v1
kind: Job
metadata:
  name: agent-sandbox-run
spec:
  template:
    spec:
      # Schedule the pod under the Kata Firecracker runtime:
      # the container boots inside its own microVM kernel.
      runtimeClassName: kata-fc
      restartPolicy: Never
      containers:
        - name: agent
          image: your-agent-image:latest
          resources:
            # Equal requests and limits give the microVM a fixed size.
            requests:
              cpu: "2"
              memory: "4Gi"
            limits:
              cpu: "2"
              memory: "4Gi"
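Dispatching a run with this Job is plain kubectl. A sketch, assuming the manifest above is saved as agent-sandbox-run.yaml (these commands require a cluster with the kata-fc RuntimeClass installed):

```shell
# Launch the sandboxed run, wait for it to finish, and collect output.
kubectl apply -f agent-sandbox-run.yaml
kubectl wait --for=condition=complete job/agent-sandbox-run --timeout=120s
kubectl logs job/agent-sandbox-run
# Jobs with restartPolicy: Never are one-shot; delete to reclaim the microVM.
kubectl delete job agent-sandbox-run
```

Each run gets a fresh microVM and kernel, which is what gives you the ephemeral-lifecycle property from the list above.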

Hardware requirements

This matters: Kata requires KVM. That means hardware virtualization -- Intel VT-x or AMD-V -- must be accessible to the host.

  • Bare metal hosts -- work natively. This is the straightforward path.
  • Standard cloud VMs (most AWS EC2, most GCP instances) -- do not support nested virtualization. Kata needs bare metal instances (e.g., AWS EC2 Metal) or a cloud provider that explicitly enables nested virt.
  • Physical x86_64 machines -- Kata works natively if VT-x or AMD-V is enabled in firmware.
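A quick way to confirm whether a given host can back Kata with KVM is to check for the hardware virtualization CPU flags and the KVM device node. This is a generic sketch, not a Kata-provided tool:

```shell
# Print whether this host can back Kata microVMs with KVM.
# vmx = Intel VT-x, svm = AMD-V; /dev/kvm exists when the kvm module is loaded.
if grep -qE 'vmx|svm' /proc/cpuinfo && [ -e /dev/kvm ]; then
  echo "KVM available"
else
  echo "KVM unavailable"
fi
```

On a standard cloud VM without nested virtualization, the CPU flags are typically absent and this prints "KVM unavailable".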

The networking warning you can't skip

Kubernetes NetworkPolicy objects only work if your CNI plugin enforces them. Flannel -- the default CNI for k3s -- does not enforce NetworkPolicy. If you create NetworkPolicy objects on a Flannel cluster, they are silently accepted and silently ignored. Your agent pods have unrestricted network access regardless of what the policy says.

To actually enforce egress restrictions on agent sandboxes:

  • Replace Flannel with Calico or Cilium as your CNI
  • Or add Calico in policy-only mode alongside Flannel

This is not optional if network isolation is part of your threat model.
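With an enforcing CNI in place, a default-deny egress policy on the sandbox namespace is a reasonable starting point. A sketch -- the namespace name and the DNS allowance are illustrative assumptions, not part of any setup described here:

```yaml
# Deny all egress from pods in the sandbox namespace, allowing only DNS.
# Only an enforcing CNI (Calico, Cilium) applies this -- Flannel ignores it.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-egress-lockdown
  namespace: agent-sandboxes
spec:
  podSelector: {}
  policyTypes:
    - Egress
  egress:
    # No "to" selector: this rule permits DNS to any destination,
    # and everything else is denied by the default-deny semantics.
    - ports:
        - protocol: UDP
          port: 53
```

Tighten the DNS rule to your cluster's resolver, and add explicit allow rules per destination the agent legitimately needs.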


gVisor vs. Kata: which one for agent sandboxing?

Factor                  gVisor                                          Kata + Firecracker
Isolation model         Userspace kernel intercepts all syscalls        Full VM per container -- separate kernel
Cold start              50-150ms                                        150-300ms
KVM required            No                                              Yes
GPU support             No                                              No (Firecracker); yes with QEMU
Syscall compatibility   ~70-80% of Linux syscalls                       Full Linux kernel -- complete compatibility
Performance overhead    10-30% on I/O-heavy workloads                   5-15% from VM overhead
Best for                No-KVM environments; compute-bound workloads    Untrusted arbitrary code; compliance requirements

gVisor's syscall coverage (~70-80%) means some workloads that depend on less common syscalls will fail silently or with confusing errors. Kata runs a real Linux kernel, so syscall compatibility is complete. For agents executing arbitrary code where you can't predict the syscall surface in advance, Kata is the stronger choice.


What Stratus does (honest current state)

Stratus has experimental Kata microVM support on bare metal. Manifests exist for ephemeral Kata Jobs; network isolation is specified in those manifests but not yet enforced, as detailed below.

What exists today:

  • Kata microVM runtime installed and functional on the bare metal node (kata-fc and kata-clh RuntimeClasses active)
  • Ephemeral Job manifests for isolated agent execution
  • Same infrastructure as the production CI/CD routing layer (see the ARC setup for how the runner fleet is structured)

What's not yet ready:

  • No public API surface for agent sandbox dispatch -- you can't call an endpoint and get back a sandbox handle
  • Kata + ARC runner integration has known kubectl exec limitations: runner-to-workflow-pod communication breaks in Kata pods, which affects certain ARC lifecycle hooks
  • Network isolation is not enforced in the current k3s cluster -- the cluster uses Flannel, which means NetworkPolicy objects are present but not enforced. A CNI migration to Calico or Cilium is required before sandbox network isolation is production-grade

If you're building on self-hosted Kubernetes today and want real isolation: install Calico or Cilium as your CNI before deploying any agent sandbox workloads. The Kata runtime is table stakes; the network enforcement is where most self-hosted setups fall short.

We're building the self-hosted sandbox layer as part of Stratus -- ephemeral Kata microVM sandboxes on your own hardware, no data leaving your infrastructure. If you're working on agent infrastructure and want early access:

Join the waitlist →