An AgentRunner is a long-lived process: it registers with the Unpod orchestrator over WSS, sends heartbeats at an interval the orchestrator assigns on registration, and serves one bridge connection per call. Provisioning is covered by the Setup Checklist; this page covers what changes when you move from your laptop to production.

What production needs

A production deployment should have:
  • A public serving_url so Unpod can reach the runner bridge
  • An agent_secret so inbound bridge connections are signed and verified
  • A stable UNPOD_API_KEY for the management API
  • A clear agent_id shared by every replica of the same agent
  • Capacity limits (max_sessions, permits_per_minute) that match your traffic
  • Shutdown handling so active calls can drain before the process exits
  • Observability for active calls, failures, and latency
If you are writing docs for app developers, this is the minimum contract:
  1. Set the environment variables.
  2. Expose the runner publicly.
  3. Lock down the bridge with UNPOD_AGENT_SECRET.
  4. Run multiple replicas with the same agent_id when you need scale.
  5. Restart the process under a supervisor.

Production checklist

1. Reachability - serving_url

The runner serves a per-call bridge that Unpod dials into. It binds to 0.0.0.0:8765 by default; in production, tell the orchestrator where that bridge is publicly reachable:
export UNPOD_RUNNER_URL="wss://agents.example.com:8765"
Alternatively, pass serving_url=... to AgentRunner. Either way, the host and port must be reachable from Unpod.
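A quick offline sanity check can catch the most common mistake, which is advertising a loopback or bind-all address as the public URL. The sketch below is illustrative only: check_serving_url is a hypothetical helper, not part of the Unpod SDK, and it cannot prove that a DNS name is actually reachable from Unpod.

```python
import ipaddress
from urllib.parse import urlparse


def check_serving_url(url: str) -> list[str]:
    """Return a list of problems found in a candidate serving URL (empty if none)."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("ws", "wss"):
        problems.append(f"scheme should be ws or wss, got {parsed.scheme!r}")
    elif parsed.scheme == "ws":
        problems.append("ws:// is unencrypted; prefer wss:// in production")

    host = parsed.hostname
    if not host:
        problems.append("no host in URL")
        return problems

    if host == "localhost":
        problems.append("localhost is not reachable from outside this machine")
    else:
        try:
            ip = ipaddress.ip_address(host)
        except ValueError:
            ip = None  # a DNS name; it cannot be judged without a network lookup
        if ip is not None and (ip.is_loopback or ip.is_unspecified or ip.is_private):
            problems.append(f"{host} is not a publicly routable address")
    return problems


print(check_serving_url("wss://agents.example.com:8765"))  # []
print(check_serving_url("wss://0.0.0.0:8765"))             # one problem reported
```

Running a check like this at startup, before constructing the runner, fails fast on a bad configuration instead of leaving the runner registered but unreachable.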

2. Authentication - agent_secret

With an agent secret set, every inbound bridge connection is HMAC-verified (signed URLs, replay-protected). Without one, the runner accepts unsigned connections - acceptable only for local dev.
export UNPOD_AGENT_SECRET="a-long-random-secret"
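To make "HMAC-verified, signed, replay-protected" concrete, here is a minimal sketch of how such a scheme can work in general. It is not Unpod's actual wire format: the signed fields (call_id, expires, nonce), the sign function, and the Verifier class are all assumptions chosen for illustration. The runner performs the real verification for you once the secret is set.

```python
import hashlib
import hmac
import time


def sign(secret: str, call_id: str, expires: int, nonce: str) -> str:
    """Compute an HMAC-SHA256 signature over the (hypothetical) signed fields."""
    message = f"{call_id}\n{expires}\n{nonce}".encode()
    return hmac.new(secret.encode(), message, hashlib.sha256).hexdigest()


class Verifier:
    """Checks expiry, signature, and nonce reuse for inbound connections."""

    def __init__(self, secret: str):
        self.secret = secret
        self.seen = {}  # nonce -> expiry time, kept until the nonce expires

    def verify(self, call_id, expires, nonce, signature, now=None) -> bool:
        now = time.time() if now is None else now
        # Forget nonces whose signatures could no longer be accepted anyway.
        self.seen = {n: e for n, e in self.seen.items() if e > now}
        if expires <= now:
            return False  # signature has expired
        expected = sign(self.secret, call_id, expires, nonce)
        if not hmac.compare_digest(expected, signature):
            return False  # wrong secret or tampered fields
        if nonce in self.seen:
            return False  # same signed URL presented twice: replay
        self.seen[nonce] = expires
        return True


verifier = Verifier("a-long-random-secret")
sig = sign("a-long-random-secret", "call-1", 1000, "nonce-1")
print(verifier.verify("call-1", 1000, "nonce-1", sig, now=900))  # True
print(verifier.verify("call-1", 1000, "nonce-1", sig, now=901))  # False (replay)
```

The constant-time comparison (hmac.compare_digest) and the expiry bound on the nonce cache are the two details that matter most in any scheme of this shape.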

3. Capacity

AgentRunner(
    entrypoint=handle_call,   # your per-call handler, defined elsewhere
    agent_id="my-agent",
    max_sessions=50,          # orchestrator never dispatches beyond this
    permits_per_minute=120,   # rate of new call acceptance
    drain_timeout_s=60,       # seconds to drain active calls on SIGTERM
)
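One way to pick starting values is Little's law: average concurrency equals arrival rate times average call duration. The sketch below applies it with a safety margin. The function name, the headroom factor of 1.25, and the assumption that both limits apply per runner process (so the totals are divided across replicas) are illustrative choices, not documented Unpod behavior.

```python
import math


def size_limits(peak_calls_per_minute, avg_call_seconds, replicas=1, headroom=1.25):
    """Estimate per-replica max_sessions and permits_per_minute from traffic."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # Little's law: concurrent calls = arrivals per second * seconds per call.
    concurrent = peak_calls_per_minute * avg_call_seconds / 60.0
    return {
        "max_sessions": math.ceil(concurrent * headroom / replicas),
        "permits_per_minute": math.ceil(peak_calls_per_minute * headroom / replicas),
    }


# 20 new calls per minute at peak, 2-minute average call, one replica:
print(size_limits(20, 120))              # {'max_sessions': 50, 'permits_per_minute': 25}
# The same traffic spread over two replicas:
print(size_limits(20, 120, replicas=2))  # {'max_sessions': 25, 'permits_per_minute': 13}
```

Treat the result as a starting point and adjust it against observed peak concurrency and per-call resource use.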

4. Scaling - add replicas

Run more processes with the same agent_id - on one machine or many. They form a pool; the orchestrator load-balances dispatches across replicas by reported capacity. No shared state, no coordination needed.
# replica 1, replica 2, ... identical:
UNPOD_API_KEY=sk_... UNPOD_AGENT_SECRET=... python agent.py
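As a mental model for capacity-based balancing, the sketch below picks the replica with the most free sessions and refuses to dispatch when the pool is full. This is one plausible policy written for illustration; the orchestrator's real algorithm is not described here, and Replica and pick_replica are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Replica:
    name: str
    max_sessions: int
    active: int = 0

    @property
    def free(self) -> int:
        return self.max_sessions - self.active


def pick_replica(pool):
    """Return the replica with the most free sessions, or None if all are full."""
    candidates = [r for r in pool if r.free > 0]
    if not candidates:
        return None
    return max(candidates, key=lambda r: r.free)


pool = [Replica("replica-1", max_sessions=50, active=48),
        Replica("replica-2", max_sessions=50, active=10)]
print(pick_replica(pool).name)  # replica-2
```

Because each replica only reports its own load, a policy of this kind needs no shared state between replicas, which matches the "no coordination" property described above.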

5. Shutdown and supervision

On SIGTERM the runner stops accepting dispatches and drains active calls for up to drain_timeout_s before exiting - container- and systemd-friendly.
The runner does not reconnect on its own if the orchestrator WebSocket drops. Run it under a supervisor that restarts it on exit: systemd Restart=always, Kubernetes restartPolicy: Always, or Docker --restart=always.
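The runner handles draining for you, but the behavior is easier to reason about with a small model. The asyncio sketch below waits for in-flight calls up to a deadline and cancels whatever is left. The drain and wait_for_sigterm functions are illustrative stand-ins, not the SDK's implementation.

```python
import asyncio
import signal


async def drain(active_calls: set, drain_timeout_s: float) -> int:
    """Wait up to drain_timeout_s for active call tasks; cancel the rest.

    Returns the number of calls that had to be cancelled.
    """
    if not active_calls:
        return 0
    _done, pending = await asyncio.wait(active_calls, timeout=drain_timeout_s)
    for task in pending:
        task.cancel()
    if pending:
        # Let the cancelled tasks finish unwinding before the process exits.
        await asyncio.gather(*pending, return_exceptions=True)
    return len(pending)


async def wait_for_sigterm() -> None:
    """Block until SIGTERM arrives (Unix only; add_signal_handler is not on Windows)."""
    stop = asyncio.Event()
    asyncio.get_running_loop().add_signal_handler(signal.SIGTERM, stop.set)
    await stop.wait()


async def demo() -> int:
    fast = asyncio.create_task(asyncio.sleep(0.01))  # finishes inside the window
    slow = asyncio.create_task(asyncio.sleep(10))    # outlives the window
    return await drain({fast, slow}, drain_timeout_s=0.2)


print(asyncio.run(demo()))  # 1
```

Set the supervisor's own stop timeout (for example, the container's termination grace period) longer than drain_timeout_s, or the supervisor will kill the process before the drain completes.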

Environment summary

For a production runner, the usual minimum .env looks like this:
UNPOD_API_KEY="sk_..."
UNPOD_AGENT_SECRET="long-random-secret"
UNPOD_BASE_URL="api.unpod.ai"
UNPOD_RUNNER_URL="wss://agents.example.com:8765"
If you set URLs in code instead of .env, the constructor argument takes precedence over the corresponding environment variable:
  • AgentRunner(base_url=...) overrides UNPOD_ORCHESTRATOR_URL
  • AgentRunner(serving_url=...) overrides UNPOD_RUNNER_URL
  • AsyncClient(base_url=...) overrides UNPOD_SERVICE_BASE_URL
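The precedence rule above amounts to "explicit argument first, environment variable second". A minimal sketch of that lookup, with resolve_setting as a hypothetical helper rather than an SDK function:

```python
import os


def resolve_setting(explicit, env_name, env=None):
    """Return the explicit value if given, else the environment variable, else None."""
    env = os.environ if env is None else env
    if explicit is not None:
        return explicit
    return env.get(env_name)


fake_env = {"UNPOD_RUNNER_URL": "wss://agents.example.com:8765"}
# No argument passed: the environment variable is used.
print(resolve_setting(None, "UNPOD_RUNNER_URL", fake_env))
# An explicit argument wins over the environment.
print(resolve_setting("wss://staging.example.com:8765", "UNPOD_RUNNER_URL", fake_env))
```

Passing the environment as a parameter keeps the lookup easy to test without mutating os.environ.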

Watch it run

Wire up metrics and runner stats before you need them.
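If you have no metrics pipeline yet, even in-process counters are a useful start. The sketch below tracks active calls, failures, and call duration with a context manager. CallStats is a hypothetical class for illustration and is separate from whatever stats the runner itself exposes.

```python
import time
from contextlib import contextmanager


class CallStats:
    """Minimal in-process counters for active calls, outcomes, and durations."""

    def __init__(self):
        self.active = 0
        self.completed = 0
        self.failed = 0
        self.durations = []  # seconds per call, successes and failures alike

    @contextmanager
    def track(self):
        self.active += 1
        start = time.monotonic()
        try:
            yield
        except Exception:
            self.failed += 1
            raise  # never swallow the handler's error
        else:
            self.completed += 1
        finally:
            self.active -= 1
            self.durations.append(time.monotonic() - start)


stats = CallStats()
with stats.track():
    pass  # the body of your call handler would run here
print(stats.active, stats.completed, stats.failed)  # 0 1 0
```

Wrapping the body of the call entrypoint in stats.track() gives the three signals listed in the production requirements; exporting them to your metrics system is a separate step.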