An AgentRunner is a long-lived process: it registers with the Unpod
orchestrator over WSS, heartbeats (at an interval the orchestrator assigns on
registration), and serves a bridge connection per call. Provisioning is covered
by the Setup Checklist; this page covers what changes when you leave your
laptop.
What production needs
A production deployment should have:
- A public serving_url so Unpod can reach the runner bridge
- An agent_secret so inbound bridge connections are signed and verified
- A stable UNPOD_API_KEY for the management API
- A clear agent_id shared by every replica of the same agent
- Capacity limits (max_sessions, permits_per_minute) that match your traffic
- Shutdown handling so active calls can drain before the process exits
- Observability for active calls, failures, and latency
If you are writing docs for app developers, this is the minimum contract:
- Set the environment variables.
- Expose the runner publicly.
- Lock down the bridge with UNPOD_AGENT_SECRET.
- Run multiple replicas with the same agent_id when you need scale.
- Restart the process under a supervisor.
Production checklist
1. Reachability - serving_url
The runner serves a per-call bridge that Unpod dials into. It binds to
0.0.0.0:8765 by default; in production, tell the orchestrator where that
bridge is publicly reachable:
export UNPOD_RUNNER_URL="wss://agents.example.com:8765"
(or pass serving_url=... to AgentRunner). The host/port must be reachable
from Unpod.
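A common mistake is to publish the bind address (0.0.0.0) or a loopback host as the serving URL. The following is a minimal pre-flight sketch using only the Python standard library; check_serving_url is a hypothetical helper, not part of the Unpod SDK, and the preference for wss:// over ws:// is an assumption based on the example above.

```python
from urllib.parse import urlsplit


def check_serving_url(url: str) -> list[str]:
    """Return a list of problems found in a runner serving URL (empty if none)."""
    problems = []
    parts = urlsplit(url)

    # The bridge is a WebSocket endpoint, so only ws/wss schemes make sense.
    if parts.scheme not in ("ws", "wss"):
        problems.append(f"scheme should be ws or wss, got {parts.scheme!r}")
    elif parts.scheme == "ws":
        problems.append("ws:// is unencrypted; prefer wss:// in production")

    # The host must be something Unpod can dial, not a bind or loopback address.
    host = parts.hostname or ""
    if not host:
        problems.append("missing host")
    elif host in ("0.0.0.0", "localhost", "127.0.0.1", "::1"):
        problems.append(f"{host} is a bind/loopback address, not a public one")

    return problems


for problem in check_serving_url("ws://0.0.0.0:8765"):
    print("serving_url:", problem)
```

This only checks the shape of the URL. It cannot tell whether a firewall or load balancer actually forwards traffic to the runner, so a connection test from outside your network is still worth doing.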
2. Authentication - agent_secret
With an agent secret set, every inbound bridge connection is HMAC-verified
(signed URLs, replay-protected). Without one, the runner accepts unsigned
connections - acceptable only for local dev.
export UNPOD_AGENT_SECRET="a-long-random-secret"
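One way to produce a suitably random value is Python's standard secrets module. This is a sketch, not an Unpod requirement: the 32-byte length is an assumption, and new_agent_secret is a hypothetical helper name.

```python
import secrets


def new_agent_secret(num_bytes: int = 32) -> str:
    """Return a URL-safe random string built from num_bytes of OS randomness."""
    return secrets.token_urlsafe(num_bytes)


# Print a line you can paste into a shell or a secrets manager.
print(f'export UNPOD_AGENT_SECRET="{new_agent_secret()}"')
```

Every replica of the same agent presumably needs the same secret, so generate it once and distribute it through your secrets manager rather than generating it at process start.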
3. Capacity
AgentRunner(
    entrypoint=handle_call,
    agent_id="my-agent",
    max_sessions=50,         # orchestrator never dispatches beyond this
    permits_per_minute=120,  # rate of new call acceptance
    drain_timeout_s=60,
)
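The two limits interact through average call duration. By Little's Law, steady-state concurrency is roughly the arrival rate multiplied by the average time each call stays open. The sketch below applies that to the values above; the helper names and the average-duration inputs are hypothetical, so substitute your own measurements.

```python
def steady_state_sessions(calls_per_minute: float, avg_call_seconds: float) -> float:
    """Little's Law: concurrent calls = arrival rate x average call duration."""
    return calls_per_minute * avg_call_seconds / 60.0


def binding_limit(max_sessions: int, permits_per_minute: int, avg_call_seconds: float) -> str:
    """Name the limit that is reached first under sustained load."""
    # Concurrency the runner would reach if it accepted calls at the full permit rate.
    at_full_rate = steady_state_sessions(permits_per_minute, avg_call_seconds)
    return "max_sessions" if at_full_rate > max_sessions else "permits_per_minute"


# With max_sessions=50 and permits_per_minute=120:
print(steady_state_sessions(120, 90))  # 180.0 concurrent calls at 90 s per call
print(binding_limit(50, 120, 90))      # max_sessions
print(binding_limit(50, 120, 20))      # permits_per_minute
```

For these values the break-even point is a 25-second average call (50 / 120 minutes). Longer calls fill max_sessions before the permit rate matters; shorter calls are throttled by permits_per_minute first.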
4. Scaling - add replicas
Run more processes with the same agent_id - on one machine or many. They
form a pool; the orchestrator load-balances dispatches across replicas by
reported capacity. No shared state, no coordination needed.
# replica 1, replica 2, ... identical:
UNPOD_API_KEY=sk_... UNPOD_AGENT_SECRET=... python agent.py
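Because each replica contributes its own max_sessions to the pool, a rough replica count follows from peak concurrency. This is a back-of-the-envelope sketch; replicas_needed and the 20% headroom default are assumptions for illustration, not Unpod guidance.

```python
import math


def replicas_needed(peak_concurrent_calls: int, max_sessions: int, headroom: float = 0.2) -> int:
    """Replicas required to carry peak load plus a safety margin."""
    if max_sessions <= 0:
        raise ValueError("max_sessions must be positive")
    # Inflate the peak by the headroom fraction, then round up to whole replicas.
    target = peak_concurrent_calls * (1 + headroom)
    return max(1, math.ceil(target / max_sessions))


print(replicas_needed(peak_concurrent_calls=120, max_sessions=50))  # 3
```

The headroom also gives the pool room to absorb the calls of a replica that is draining or restarting.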
5. Shutdown and supervision
On SIGTERM the runner stops accepting dispatches and drains active calls for
up to drain_timeout_s before exiting - container- and systemd-friendly.
The runner does not reconnect on its own if the orchestrator WebSocket drops -
run it under a supervisor with restart-on-exit (systemd Restart=always,
Kubernetes restart policy, Docker --restart).
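To make "drain" concrete, here is a generic asyncio illustration of the pattern: wait for in-flight work up to a deadline, then give up on whatever remains. It is not the runner's actual implementation. In particular, cancelling the leftover calls at the deadline is an assumption, and drain and demo are hypothetical names.

```python
import asyncio


async def drain(active_calls: set, drain_timeout_s: float) -> int:
    """Wait up to drain_timeout_s for in-flight tasks; cancel the rest.

    Returns the number of tasks that had to be cancelled.
    """
    if not active_calls:
        return 0
    # Tasks that finish within the deadline land in `done`, the rest in `pending`.
    done, pending = await asyncio.wait(active_calls, timeout=drain_timeout_s)
    for task in pending:
        task.cancel()
    # Let the cancellations settle so nothing is left running at exit.
    await asyncio.gather(*pending, return_exceptions=True)
    return len(pending)


async def demo() -> int:
    quick_call = asyncio.create_task(asyncio.sleep(0.01))  # finishes in time
    slow_call = asyncio.create_task(asyncio.sleep(10))     # outlives the deadline
    return await drain({quick_call, slow_call}, drain_timeout_s=0.1)


print(asyncio.run(demo()))  # 1
```

Whatever supervisor you use, its stop grace period should be longer than drain_timeout_s, or the process is killed mid-drain. Kubernetes defaults terminationGracePeriodSeconds to 30 seconds and docker stop waits 10 seconds by default, both shorter than the drain_timeout_s=60 shown earlier.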
Environment summary
For a production runner, the usual minimum .env looks like this:
UNPOD_API_KEY="sk_..."
UNPOD_AGENT_SECRET="long-random-secret"
UNPOD_BASE_URL="api.unpod.ai"
UNPOD_RUNNER_URL="wss://agents.example.com:8765"
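A missing variable is easier to diagnose at startup than on the first call. The sketch below checks for the four names listed above before the runner is constructed; missing_settings is a hypothetical helper, and treating all four as mandatory is an assumption drawn from this listing.

```python
import os

# Names taken from the .env listing above.
REQUIRED = (
    "UNPOD_API_KEY",
    "UNPOD_AGENT_SECRET",
    "UNPOD_BASE_URL",
    "UNPOD_RUNNER_URL",
)


def missing_settings(env=os.environ) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


missing = missing_settings()
if missing:
    print("missing environment variables:", ", ".join(missing))
```

In a real entry point you would likely exit with a non-zero status instead of printing, so the supervisor surfaces the misconfiguration.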
If you set URLs in code instead of .env, the constructor argument takes
precedence over the corresponding environment variable:
- AgentRunner(base_url=...) overrides UNPOD_ORCHESTRATOR_URL
- AgentRunner(serving_url=...) overrides UNPOD_RUNNER_URL
- AsyncClient(base_url=...) overrides UNPOD_SERVICE_BASE_URL
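The rule is "explicit argument first, then environment". A minimal sketch of that lookup order follows; resolve_url is a hypothetical helper written for illustration, not the SDK's internal code, and the internal hostname is made up.

```python
import os
from typing import Optional


def resolve_url(explicit: Optional[str], env_var: str, default: Optional[str] = None) -> Optional[str]:
    """Return the explicit value if one was passed, else the environment value, else default."""
    if explicit is not None:
        return explicit
    return os.environ.get(env_var, default)


os.environ["UNPOD_RUNNER_URL"] = "wss://agents.example.com:8765"

# No argument passed: the environment variable is used.
print(resolve_url(None, "UNPOD_RUNNER_URL"))
# Argument passed: it overrides the environment variable.
print(resolve_url("wss://agents.internal.example:8765", "UNPOD_RUNNER_URL"))
```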
Watch it run
Wire up metrics and runner stats before you need them.