
Architecture

PizzaPi is built around a relay pattern: the CLI on your dev machine connects outward to the relay server over persistent WebSockets, so the relay never needs to open a connection through your firewall. Two channels connect the runner to the relay: the SIO WebSocket carries the agent control plane (session events, triggers, service announcements), and the /_tunnel WebSocket carries streaming HTTP proxy traffic (service panel iframes, video, SSE, large downloads).

┌────────────────────────────────────────────────────────┐
│                    Your Dev Machine                    │
│                                                        │
│  ┌───────────────────────────────────────────────────┐ │
│  │ pizzapi CLI                                       │ │
│  │  ├─ pi coding agent (LLM + tools)                 │ │
│  │  ├─ sandbox layer (OS-level enforcement)          │ │
│  │  ├─ remote extension (event serializer)           │ │
│  │  ├─ tunnel client (streaming HTTP proxy)          │ │
│  │  └─ runner daemon (optional)                      │ │
│  └──────────┬────────────────────┬───────────────────┘ │
└─────────────┼────────────────────┼─────────────────────┘
              │ SIO WebSocket      │ /_tunnel WebSocket
              │ (agent control)    │ (streaming HTTP)
              ▼                    ▼
┌────────────────────────────────────────────────────────┐
│                      Relay Server                      │
│                                                        │
│  ┌──────────────┐   ┌──────────────┐   ┌────────────┐  │
│  │ Bun HTTP/WS  │   │    Redis     │   │ SQLite DB  │  │
│  │ + tunnel     │──►│ event buffer │   │ (users,    │  │
│  │   relay      │   │              │   │  sessions) │  │
│  └──────┬───────┘   └──────────────┘   └────────────┘  │
│         │                                              │
└─────────┼──────────────────────────────────────────────┘
          │ HTTP + WebSocket
┌────────────────────────────────────────────────────────┐
│                Browser / Mobile Web UI                 │
│  ├─ Live session stream (tokens, tool calls, diffs)    │
│  ├─ Send messages to the agent                         │
│  ├─ Manage runners and spawn sessions                  │
│  ├─ Service panel iframes (via tunnel)                 │
│  └─ API key management and settings                    │
└────────────────────────────────────────────────────────┘

The monorepo is organized into focused packages:

| Package | Description |
|---|---|
| packages/protocol | Shared TypeScript types for the relay wire protocol |
| packages/tunnel | Streaming HTTP/WebSocket tunnel library that proxies service traffic between runner and relay |
| packages/tools | Shared agent tools (bash, read-file, write-file, search) and sandbox enforcement |
| packages/server | Bun HTTP + WebSocket relay server (auth, sessions, push notifications) |
| packages/ui | React 19 PWA web interface (Vite, TailwindCSS v4, Radix UI) |
| packages/cli | CLI wrapper that launches pi with PizzaPi extensions, the runner daemon, and the Claude Code plugin adapter |
| packages/control-plane | Multi-tenant provisioner: org management, JWT auth, Caddy config, and database migrations for hosted deployments |
| packages/docs | This documentation site (Astro Starlight) |
| packages/npm | npm distribution tooling that builds and publishes the npx pizzapi packages |

Build order: protocol → tunnel → tools → server → ui → cli


| Layer | Technology |
|---|---|
| Runtime / package manager | Bun |
| Language | TypeScript (strict, ESM) |
| Server | Bun.serve, better-auth, Kysely + SQLite, Redis, web-push |
| UI | React 19, Vite 6, TailwindCSS v4, Radix UI, shadcn/ui |
| Agent core | @mariozechner/pi-coding-agent, @mariozechner/pi-ai |

When you run pizzapi:

  1. The CLI starts a pi agent session with PizzaPi extensions injected
  2. The remote extension opens a WebSocket to the relay server, authenticated with your API key
  3. Every agent event (tokens, tool calls, file writes, errors) is serialized and sent over that WebSocket
  4. The relay server buffers recent events in Redis and broadcasts to any connected browser clients
  5. The web UI receives events and renders them in real time — tokens stream as they arrive, tool calls show inputs/outputs, file changes appear as diffs

Agent event
    │
    ▼
remote extension
    │ serialize → JSON
    ▼
WebSocket → Relay Server
    ├──► Redis (ring buffer, configurable size via PIZZAPI_RELAY_EVENT_BUFFER_SIZE)
    └──► WebSocket broadcast → all connected browser tabs
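
The buffer-and-broadcast step can be sketched as a fixed-size ring buffer with a fan-out to subscribers. This is an in-memory illustration only, not the relay's code: the real buffer lives in Redis, and `AgentEvent`, `EventRing`, `publish`, and `subscribe` are names invented for the example. Replaying the buffer to a newly connected tab is also an assumption about why recent events are kept.

```typescript
// Hypothetical event shape; the real wire types live in packages/protocol.
interface AgentEvent {
  seq: number;
  type: string;
  payload: unknown;
}

type Subscriber = (serialized: string) => void;

// In-memory stand-in for the Redis-backed buffer: keeps only the newest events.
class EventRing {
  private readonly capacity: number;
  private readonly events: string[] = [];
  private readonly subscribers = new Set<Subscriber>();

  constructor(capacity: number) {
    if (!Number.isInteger(capacity) || capacity <= 0) {
      throw new RangeError("capacity must be a positive integer");
    }
    this.capacity = capacity;
  }

  // Serialize once, store the result, then fan out to every live subscriber.
  publish(event: AgentEvent): void {
    const serialized = JSON.stringify(event);
    this.events.push(serialized);
    if (this.events.length > this.capacity) {
      this.events.shift(); // drop the oldest event
    }
    this.subscribers.forEach((send) => send(serialized));
  }

  // Assumed behaviour: a new subscriber first receives the buffered events,
  // then every event published afterwards. Returns an unsubscribe function.
  subscribe(send: Subscriber): () => void {
    this.events.forEach((serialized) => send(serialized));
    this.subscribers.add(send);
    return () => {
      this.subscribers.delete(send);
    };
  }
}
```

With a capacity of 2, a tab that connects after three events have been published would see only the last two replayed, while tabs that were already connected saw all three as they arrived.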

The web UI also displays a context window donut graph in the chat footer, showing real-time token usage (input, output, cache) as a proportion of the model’s context window. This updates live as the conversation progresses.
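
The proportions behind such a graph are simple arithmetic. The sketch below assumes a usage record with `input`, `output`, and `cache` counts and treats all three as occupying the context window; both the field names and that accounting are assumptions, and `contextDonut` is a hypothetical helper rather than the UI's actual function.

```typescript
// Hypothetical usage shape; field names are assumptions for this sketch.
interface TokenUsage {
  input: number;
  output: number;
  cache: number;
}

interface DonutSegment {
  label: "input" | "output" | "cache" | "free";
  fraction: number; // share of the context window
}

// Express each token category as a fraction of the model's context window.
function contextDonut(usage: TokenUsage, contextWindow: number): DonutSegment[] {
  if (contextWindow <= 0) {
    throw new RangeError("contextWindow must be positive");
  }
  const used = usage.input + usage.output + usage.cache;
  return [
    { label: "input", fraction: usage.input / contextWindow },
    { label: "output", fraction: usage.output / contextWindow },
    { label: "cache", fraction: usage.cache / contextWindow },
    // Clamp so an over-full window never yields a negative "free" slice.
    { label: "free", fraction: Math.max(0, 1 - used / contextWindow) },
  ];
}
```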


The runner maintains a dedicated /_tunnel WebSocket connection to the relay alongside the main SIO connection. This channel handles all HTTP proxy traffic for runner services — service panel iframes, video streams, SSE, and large downloads.

Components:

  • TunnelRelay (server-side, packages/tunnel) — manages runner connections, proxies incoming HTTP requests with streaming callbacks, and bridges WebSocket connections
  • TunnelClient (runner-side, packages/tunnel) — connects to the relay, streams local HTTP responses from service ports, and bridges WebSocket connections to local services

Runner                      Relay                      Browser
  │                           │                           │
  ├── SIO WebSocket ─────────►│◄── SIO WebSocket ─────────┤  (agent events)
  │                           │                           │
  ├── /_tunnel WS ───────────►│                           │
  │   (TunnelClient)          │   (TunnelRelay)           │
  │                           │                           │
  │   local HTTP service      │   GET /api/tunnel/... ────┤  (service panel)
  │   ◄── stream chunks ─────►│◄── stream chunks ────────►│

Key improvements over the previous buffered proxy:

  • Response chunks stream via ReadableStream as they arrive — no full-body buffering
  • No size cap — video, large downloads, and SSE all work
  • Dedicated WebSocket channel (not muxed on the SIO control connection)
  • Native ws library handles framing (replaced hand-rolled RFC 6455 parser)
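
To illustrate the first point, here is a minimal sketch of chunk-by-chunk streaming with the standard `ReadableStream` API. It is not the tunnel library's code: `streamChunks`, `fakeServiceBody`, and `readAll` are hypothetical helpers, and the async generator stands in for a local service's response body.

```typescript
// Wrap an async chunk source in a ReadableStream. A chunk is pulled only when
// the consumer asks for more, so the full body is never held in memory.
function streamChunks(source: AsyncIterable<Uint8Array>): ReadableStream<Uint8Array> {
  const iterator = source[Symbol.asyncIterator]();
  return new ReadableStream<Uint8Array>({
    async pull(controller) {
      const result = await iterator.next();
      if (result.done) {
        controller.close();
      } else {
        controller.enqueue(result.value);
      }
    },
    async cancel() {
      // Stop the upstream producer when the consumer goes away.
      if (iterator.return) {
        await iterator.return();
      }
    },
  });
}

// Hypothetical stand-in for a local service that emits its body in pieces.
async function* fakeServiceBody(): AsyncGenerator<Uint8Array> {
  const encoder = new TextEncoder();
  for (const part of ["chunk-1 ", "chunk-2 ", "chunk-3"]) {
    yield encoder.encode(part);
  }
}

// Drain a stream into a string, decoding chunk by chunk.
async function readAll(stream: ReadableStream<Uint8Array>): Promise<string> {
  const reader = stream.getReader();
  const decoder = new TextDecoder();
  let text = "";
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    text += decoder.decode(value, { stream: true });
  }
  return text + decoder.decode();
}
```

A stream built this way can be passed directly as a Fetch API response body with `new Response(stream)`, which is what lets video and SSE flow through without a size cap.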

When a browser requests /api/tunnel/{sessionId}/{port}/{path}, the relay forwards the request through the tunnel to the runner’s local HTTP service on that port, and streams the response back.
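
The URL shape above can be parsed as follows. This is a sketch of one way to split the path, with `parseTunnelPath` and `TunnelTarget` as hypothetical names; the relay's actual routing code may differ.

```typescript
interface TunnelTarget {
  sessionId: string;
  port: number;
  path: string; // path forwarded to the local service, always starting with "/"
}

// Parse /api/tunnel/{sessionId}/{port}/{path}.
// Returns null when the URL does not match or the port is out of range.
function parseTunnelPath(pathname: string): TunnelTarget | null {
  const match = /^\/api\/tunnel\/([^/]+)\/(\d{1,5})(\/.*)?$/.exec(pathname);
  if (!match) return null;

  const sessionId = match[1];
  const portText = match[2];
  if (sessionId === undefined || portText === undefined) return null;

  const port = Number(portText);
  if (port < 1 || port > 65535) return null;

  // A request for the bare port maps to the service root.
  return { sessionId, port, path: match[3] ?? "/" };
}
```

For example, `/api/tunnel/abc123/3000/index.html` would yield session `abc123`, port `3000`, and path `/index.html` (the session ID here is made up).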


The server uses better-auth for account management:

  • Email/password registration and login
  • Session cookies for the web UI
  • API keys for the CLI and runner — issued at registration, stored in ~/.pizzapi/config.json

All WebSocket connections from the CLI require a valid API key in the connection headers.


See the Runner Daemon guide for the full process hierarchy diagram. In summary:

  • The supervisor (pizzapi runner) is the stable outer process
  • It spawns the daemon as a child and restarts it on crash
  • The daemon spawns one worker per session request, isolated so a crash never affects other sessions
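
The restart-on-crash behaviour can be sketched as a loop around a function that starts the daemon and resolves with its exit code. Everything here is illustrative: `StartDaemon` and `supervise` are hypothetical names, and the restart cap is an assumption added so the sketch terminates, not something the source describes.

```typescript
// Hypothetical: starts the daemon and resolves with its exit code when it ends.
type StartDaemon = () => Promise<number>;

// Keep restarting the daemon until it exits cleanly (code 0).
// Returns how many restarts were needed; gives up after maxRestarts crashes
// have already been retried (the cap is an assumption of this sketch).
async function supervise(startDaemon: StartDaemon, maxRestarts: number): Promise<number> {
  let restarts = 0;
  for (;;) {
    const exitCode = await startDaemon();
    if (exitCode === 0) {
      return restarts; // clean shutdown: stop supervising
    }
    if (restarts >= maxRestarts) {
      throw new Error(`daemon crashed ${restarts + 1} times; giving up`);
    }
    restarts += 1; // crash: start a fresh daemon
  }
}
```

In a real supervisor, `startDaemon` would wrap a child process and resolve from its exit event.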

The runner includes a usage analytics dashboard that tracks token consumption and costs across sessions.

  • Data pipeline: A scanner processes session JSONL files from ~/.pizzapi/agent/sessions/ into a local SQLite database with daily rollups, model breakdowns, and project breakdowns
  • Server endpoint: GET /api/runners/:id/usage?range=7d|30d|90d|all serves aggregated data to the UI
  • Web UI: Interactive charts showing token usage over time, cost breakdown, model distribution, and sessions by project — lazy-loaded via React.lazy to keep recharts out of the main bundle
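
A rough sketch of the two pieces described above: mapping the `range` query value to a time window, and rolling session records up by day. The record shape (`timestamp`, `model`, `tokens`) and the helper names are assumptions; the real scanner's JSONL schema and SQLite layout are not shown in this document.

```typescript
// Map the `range` query value to a number of days; null means "no lower bound".
function rangeToDays(range: string): number | null {
  switch (range) {
    case "7d":
      return 7;
    case "30d":
      return 30;
    case "90d":
      return 90;
    case "all":
      return null;
    default:
      throw new RangeError(`unsupported range: ${range}`);
  }
}

// Hypothetical record parsed from one line of a session JSONL file.
interface UsageRecord {
  timestamp: string; // ISO 8601
  model: string;
  tokens: number;
}

// Sum tokens per UTC calendar day, keyed as "YYYY-MM-DD".
function rollupByDay(records: UsageRecord[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const record of records) {
    const day = new Date(record.timestamp).toISOString().slice(0, 10);
    totals.set(day, (totals.get(day) ?? 0) + record.tokens);
  }
  return totals;
}
```

The same grouping, keyed by `model` or by project instead of by day, would give the other breakdowns the dashboard shows.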

The CLI includes a plugin adapter that discovers Claude Code plugins and maps them into pi’s runtime:

  • Commands (commands/*.md) → pi slash commands (/plugin-name:command)
  • Hooks (hooks/hooks.json) → pi lifecycle events (tool_call, tool_result, input, etc.)
  • Skills (skills/*/SKILL.md) → pi skills (loaded via the Agent Skills standard)
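
As an illustration of the first mapping, the sketch below derives a slash command name from a plugin name and a command file path. `slashCommandFor` is a hypothetical helper, and the plugin and file names in the example are made up; the adapter's real discovery logic is not shown here.

```typescript
// Derive the slash command for a file directly under commands/.
// Returns null for anything that is not commands/<name>.md.
function slashCommandFor(pluginName: string, commandFile: string): string | null {
  const match = /^commands\/([^/]+)\.md$/.exec(commandFile);
  if (!match) return null;
  const commandName = match[1];
  if (commandName === undefined) return null;
  return `/${pluginName}:${commandName}`;
}
```

Under this scheme, `commands/review.md` in a plugin called `lint-tools` would surface as `/lint-tools:review`.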

The adapter scans global dirs (~/.pizzapi/plugins/, ~/.agents/plugins/, ~/.claude/plugins/) automatically. Project-local plugins require explicit trust — managed via pizza plugins trust or interactive confirmation at session start.

See the Claude Code Plugins guide for full details.


Runner services are background processes that run on the runner and expose interactive UI panels in the web interface. Each service ships its own HTML panel rendered in an iframe — no React compilation needed. Services are discovered automatically from ~/.pizzapi/services/ and from plugins.

See the Runner Services guide for full details on creating custom services.


The CLI includes a subagent tool that delegates tasks to child agent processes with isolated context windows:

Parent agent session
  │
  ▼ subagent tool call
  ├──► Discover agents from ~/.pizzapi/agents/ and .pizzapi/agents/
  ├──► Spawn `pi --mode json --no-session` child process
  │      ├── Isolated context window (own token budget)
  │      ├── Scoped tools (e.g., read-only for research agents)
  │      └── Optional model override (e.g., Haiku for cheap tasks)
  ├──► Stream progress updates to parent (onUpdate callback)
  │      └── Relay forwards to web UI for inline rendering
  └──► Return final output as tool result
         └── Only the summary enters the parent's context

Modes:

  • Single — one agent, one task
  • Parallel — up to 4 concurrent agents (max 8 tasks)
  • Chain — sequential steps with {previous} output substitution
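
The chain and parallel modes can be sketched as follows. The limits (4 concurrent agents, 8 tasks) come from the list above; everything else is an assumption for illustration. `RunStep` stands in for spawning a child agent, and `fillPrevious`, `runChain`, and `runParallel` are hypothetical names, not the tool's API.

```typescript
const MAX_CONCURRENT = 4; // concurrent agents, per the limits above
const MAX_TASKS = 8; // tasks per parallel call, per the limits above

// Hypothetical runner: the real tool would spawn a child agent process here.
type RunStep = (prompt: string) => Promise<string>;

// Replace every {previous} placeholder with the prior step's output.
// split/join avoids the special "$" patterns of String.prototype.replace.
function fillPrevious(template: string, previous: string): string {
  return template.split("{previous}").join(previous);
}

// Chain mode: run steps in order, feeding each output into the next prompt.
async function runChain(steps: string[], run: RunStep): Promise<string> {
  let previous = "";
  for (const step of steps) {
    previous = await run(fillPrevious(step, previous));
  }
  return previous;
}

// Parallel mode: a small worker pool that never exceeds MAX_CONCURRENT.
// Results are returned in the same order as the input tasks.
async function runParallel(tasks: string[], run: RunStep): Promise<string[]> {
  if (tasks.length > MAX_TASKS) {
    throw new RangeError(`at most ${MAX_TASKS} tasks are allowed`);
  }
  const results: string[] = tasks.map(() => "");
  let next = 0;
  const worker = async (): Promise<void> => {
    while (next < tasks.length) {
      const index = next++;
      const task = tasks[index];
      if (task === undefined) break;
      results[index] = await run(task);
    }
  };
  const poolSize = Math.min(MAX_CONCURRENT, tasks.length);
  await Promise.all(Array.from({ length: poolSize }, () => worker()));
  return results;
}
```

Single mode is the degenerate case: one call to `run` with one prompt.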

See the Subagents guide for full details.


For a complete guide to setting up a local development environment, running tests, and contributing, see the Development guide.