Quickstart - Unpod AI

Talk to your own voice agent in the browser in about 5 minutes. No phone number needed. Unpod hosts the speech service - microphone capture, speech-to-text, text-to-speech, and the audio bridge. You write the agent’s brain and run a small browser UI on your machine to speak with it. This is the fastest way to hear your agent. When you want a real phone call, follow Make Your First Phone Call afterwards. For production, follow the steps below and use the deploy checklist for the runner-specific settings.

Unpod

Phone numbers, STT, TTS, routing, call control

Your App

Agent brain, tools, prompts, business logic

Production

Public runner, secrets, scaling, shutdown, observability

Step 1 - Install the SDK

With pip:

pip install "unpod[dialog]"

Or with uv:

uv add "unpod[dialog]"

Source on GitHub: unpod-ai/unpod-python-sdk.

Step 2 - Set your environment

Where to get your keys:

UNPOD_API_KEY - unpod.ai/api-keys → create a new key
ANTHROPIC_API_KEY - your own key from console.anthropic.com → API Keys. Use any provider you want - OpenAI, Gemini, etc. The variable name is just an example.

Create a .env file in your project root and add the API keys:

Local Dev

UNPOD_API_KEY="sk_..."
ANTHROPIC_API_KEY="sk-ant-..."

# Local dev overrides - remove for production
UNPOD_SERVICE_BASE_URL=http://localhost:8000/platform
UNPOD_ORCHESTRATOR_URL=ws://localhost:8000

Production

UNPOD_API_KEY="sk_..."
ANTHROPIC_API_KEY="sk-ant-..."
UNPOD_BASE_URL="unpod.ai"

# Runner hardening
UNPOD_AGENT_SECRET="long-random-secret"
UNPOD_RUNNER_URL="wss://agents.example.com:8765"

UNPOD_BASE_URL is the shared default for production. The SDK derives:

REST: https://<host>/platform
Runner/orchestrator: wss://<host>

For local dev, set UNPOD_SERVICE_BASE_URL and UNPOD_ORCHESTRATOR_URL separately - UNPOD_BASE_URL does not work for local services.
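The mapping from UNPOD_BASE_URL to the two endpoints can be pictured with a short sketch. This is not the SDK's code: derive_urls is a hypothetical helper written only to illustrate the derivation listed above.

```python
def derive_urls(host: str) -> dict[str, str]:
    """Illustrative only: map a bare host to the two production endpoints."""
    host = host.strip().rstrip("/")
    return {
        "rest": f"https://{host}/platform",  # REST: https://<host>/platform
        "orchestrator": f"wss://{host}",     # Runner/orchestrator: wss://<host>
    }


urls = derive_urls("unpod.ai")
print(urls["rest"])
print(urls["orchestrator"])
```

Under this reading the schemes are fixed to https and wss, so an http://localhost or ws://localhost address cannot be expressed through UNPOD_BASE_URL. That is presumably why local dev sets the two explicit variables instead.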

If you use a different dialog provider, set its key here instead of ANTHROPIC_API_KEY.

Call load_dotenv() before any from unpod import ... statement. The SDK reads environment variables at import time, so keys loaded after the import are not picked up.
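Because the variables are read at import time, a missing key is easiest to catch just before the SDK import. The sketch below uses only the standard library. missing_env is a hypothetical helper, not part of the SDK, and the required names are the ones from Step 2.

```python
import os
from collections.abc import Mapping
from typing import Optional

# Names from Step 2; swap ANTHROPIC_API_KEY for your own provider's key.
REQUIRED = ("UNPOD_API_KEY", "ANTHROPIC_API_KEY")


def missing_env(
    required: tuple[str, ...] = REQUIRED,
    env: Optional[Mapping[str, str]] = None,
) -> list[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Demonstration with an explicit mapping instead of the real environment.
print(missing_env(env={"UNPOD_API_KEY": "sk_..."}))  # ['ANTHROPIC_API_KEY']
```

In a real script, one option is to call load_dotenv(override=True) first, then missing_env(), and only then import from unpod once the returned list is empty.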

Step 3 - Pick a Voice Profile

from dotenv import load_dotenv
load_dotenv(override=True)

import asyncio
from unpod import AsyncClient

async def main():
    async with AsyncClient() as client:
        profiles = await client.voice_profiles.list(language="en")
        for p in profiles:
            print(p.id, p.name, p.gender, p.quality)

asyncio.run(main())

The script prints one line per available profile. Copy the profile ID (p.id) of the one you want and use it as voice_profile in Step 4.

Step 4 - Create a Speech Pipe

A Speech Pipe ties a voice profile to your agent. The agent_id is a string you choose - it links the pipe to the AgentRunner you run in Step 7T.

from dotenv import load_dotenv
load_dotenv(override=True)

import asyncio
from unpod import AsyncClient

async def main():
    async with AsyncClient() as client:
        pipe = await client.pipes.create(
            name="Support Bot",
            voice_profile="vp_riya",         # profile_id from Step 3 (p.id)
            agent_id="my-support-agent",     # your identifier - must match AgentRunner
            recording=True,
            max_call_duration_s=600,
        )
        print("Pipe ID:", pipe.pipe_id)

asyncio.run(main())

Save the pipe.pipe_id - you will use it in the steps below. A pipe without agent_id will show as degraded and calls will not be dispatched to your runner.
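Because the agent_id chosen here has to be repeated exactly in Step 7T, one way to avoid a mismatch is to read it from a single place in both scripts. The sketch below is an assumption about project layout, not an SDK feature: MY_AGENT_ID is a hypothetical variable name that the SDK does not read, and get_agent_id is a hypothetical helper.

```python
import os
from collections.abc import Mapping


def get_agent_id(
    env: Mapping[str, str] = os.environ,
    default: str = "my-support-agent",
) -> str:
    """Return the agent ID shared by pipes.create() and AgentRunner."""
    # MY_AGENT_ID is a hypothetical variable name chosen for this sketch.
    agent_id = env.get("MY_AGENT_ID", default).strip()
    if not agent_id:
        raise ValueError("agent_id must be a non-empty string")
    return agent_id


print(get_agent_id({}))  # my-support-agent
```

Both the pipe-creation script and agent.py could then pass agent_id=get_agent_id() rather than repeating the string literal.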

Step 5 - Build a Dialog Flow (optional)

create_dialog_flow generates a structured conversation graph (nodes + edges) from a plain-English description. Run it once and save the result.

from dotenv import load_dotenv
load_dotenv(override=True)

import asyncio
from superdialog import create_dialog_flow

async def main():
    flow = await create_dialog_flow(
        prompt=(
            "You are Alex, a friendly support agent. "
            "Greet the user, ask how you can help, resolve their issue, "
            "and end the conversation when done."
        ),
        llm="anthropic/claude-haiku-4-5-20251001",  # or "openai/gpt-4o-mini"
    )
    flow.save("support.json")

asyncio.run(main())

This writes support.json. Use it in your entrypoint with DialogMachine(flow=Flow.load("support.json"), llm="...") for structured, node-based conversations. For a simpler prompt-only agent, skip this step and use AnthropicAdapter or LLMAgent directly in Step 7T.

Telephony Path

Step 6T - Attach a Phone Number

from dotenv import load_dotenv
load_dotenv(override=True)

import asyncio
from unpod import AsyncClient

PIPE_ID = "pipe_..."
NUMBER_ID = "num_..."   # from client.numbers.list()

async def main():
    async with AsyncClient() as client:
        number = await client.numbers.attach(NUMBER_ID, PIPE_ID)
        print("Attached:", number.number, "->", number.pipe_id)

asyncio.run(main())

See Phone Numbers for how to list, provision, or bring your own number.

Step 7T - Write Your Entrypoint

agent_id is a string you define (e.g. "my-support-agent"). It must be identical in both pipes.create() (Step 4) and AgentRunner below - this is how Unpod routes inbound calls to your runner.

Create agent.py:

from dotenv import load_dotenv
load_dotenv(override=True)

import os
from anthropic import AsyncAnthropic
from unpod import AgentRunner, CallContext
from unpod.adapters import AnthropicAdapter

async def handle_call(ctx: CallContext) -> None:
    ctx.session.dialog_machine = AnthropicAdapter(
        client=AsyncAnthropic(),
        model="claude-haiku-4-5-20251001",
        system_prompt=(
            "You are Alex, a friendly support agent. "
            "Greet the user, ask how you can help, resolve their issue, "
            "and end the conversation when done."
        ),
    )
    await ctx.session.run()

AgentRunner(
    entrypoint=handle_call,
    agent_id="my-support-agent",    # must match agent_id in pipes.create()
    api_key=os.getenv("UNPOD_API_KEY"),
    max_sessions=10,
    agent_secret=os.getenv("UNPOD_AGENT_SECRET"),
    serving_url=os.getenv("UNPOD_RUNNER_URL"),
).start()

Using the flow from Step 5 instead

If you ran Step 5, swap AnthropicAdapter for DialogMachine to drive a structured node-based conversation:

from dotenv import load_dotenv
load_dotenv(override=True)

import os
from unpod import AgentRunner, CallContext
from superdialog import DialogMachine, Flow

async def handle_call(ctx: CallContext) -> None:
    ctx.session.dialog_machine = DialogMachine(
        flow=Flow.load("support.json"),
        llm="anthropic/claude-haiku-4-5-20251001",
    )
    await ctx.session.run()

AgentRunner(
    entrypoint=handle_call,
    agent_id="my-support-agent",
    api_key=os.getenv("UNPOD_API_KEY"),
    max_sessions=10,
    agent_secret=os.getenv("UNPOD_AGENT_SECRET"),
    serving_url=os.getenv("UNPOD_RUNNER_URL"),
).start()

Step 8T - Run and Call

python3.12 agent.py

Call the number you attached. The runner connects to the Unpod orchestrator and dispatches inbound calls to your entrypoint. The process stays running and accepts calls until you stop it.

Web / App Path

Step 6W - Create a Session Token

Your backend generates a short-lived session token for each user. The browser uses this token to connect.

from dotenv import load_dotenv
load_dotenv(override=True)

import asyncio
from unpod import AsyncClient

async def create_session(pipe_id: str, user_id: str) -> str:
    async with AsyncClient() as client:
        token = await client.sessions.create_token(
            pipe_id=pipe_id,
            metadata={"user_id": user_id},
        )
        return token.token

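# Print the token so the example has visible output; a real backend would return it to the frontend.
token = asyncio.run(create_session("pipe_...", "usr_123"))
print("Session token:", token)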

Return this token to your frontend. Tokens are single-use - check token.expires_at for the actual expiry time.
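A backend that hands tokens to a frontend may want to check how much lifetime a token has left before returning it. The sketch below assumes that expires_at is a timezone-aware datetime, which the text above does not confirm. If the SDK returns a string or a Unix timestamp, convert it first. is_usable and its safety margin are hypothetical and not part of the SDK.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def seconds_until_expiry(expires_at: datetime, now: Optional[datetime] = None) -> float:
    """Seconds left before `expires_at`; negative once it has passed."""
    if now is None:
        now = datetime.now(timezone.utc)
    return (expires_at - now).total_seconds()


def is_usable(
    expires_at: datetime,
    now: Optional[datetime] = None,
    safety_margin_s: float = 5.0,
) -> bool:
    """True if the token still has more than `safety_margin_s` seconds left."""
    return seconds_until_expiry(expires_at, now) > safety_margin_s


# Fixed timestamps so the example does not depend on the current time.
issued = datetime(2025, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
expires_at = issued + timedelta(seconds=60)
print(is_usable(expires_at, now=issued))                          # True: 60 s left
print(is_usable(expires_at, now=issued + timedelta(seconds=58)))  # False: 2 s left
```

The margin leaves time for the token to travel to the browser and be exchanged. Since tokens are single-use, a token that fails this check should be replaced with a newly created one rather than retried.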

Step 7W - Open the Local Browser Demo

No npm package needed for local dev. Open the Supervoice web dashboard:

http://localhost:3100

Point it at your pipe - it handles microphone capture, session token exchange, and the audio bridge. The same AgentRunner from Step 7T handles both phone and browser sessions with no changes.

@unpod/web-sdk is not yet published on npm. localhost:3100 is the local Supervoice UI - start it before opening the browser demo.

Step 8W - Run Your Agent

python3.12 agent.py

The runner accepts sessions from both telephony and web SDK connections on the same agent_id.

What Unpod Owns vs. What You Own

Unpod owns	You own
Numbers, telephony, SIP trunks	Prompt + system message
STT (speech → text)	Tool calls + RAG context
Routing, threading, retries	Customer data + memory
TTS (text → speech)	Compliance + audit log
WebSocket bridge	Model choice (Claude, OpenAI, …)

Next Steps

Structured flows

Nodes, edges, router nodes, and HTTP actions inside your flow graph.

Bring your own agent

Plug in LangChain, an HTTP webhook, or any brain you already have.

Production checklist

Trunks, numbers, recording, deploy - the full production path.
