Ship Your First Managed Agent: Agent, Environment, Session

1. The evolution from API to managed infrastructure

When Claude first launched, the only interface was the Messages API: tokens in, tokens out. Developers who wanted an agent loop had to build it themselves, including context management, tool-call routing, compaction, caching, and retry logic. While agents could only do simple tasks, this was manageable. As models became capable of multi-step, multi-tool tasks, the complexity of building on that primitive exploded.

The Agent SDK gave developers a programmatic way to run Claude Code as an agent. But developers still had to manage hosting and container scaling themselves.

Claude Managed Agents is the first harness where Anthropic manages the infrastructure: sandboxing, compaction, caching, reliability, and scaling are handled for you. You configure the task and tools; the platform runs the loop.

The claim from the workshop: teams are building 10–15x faster to production using Managed Agents compared to rolling their own agent harness.

2. The three primitives

Every Managed Agent is composed of three resources:

The three Managed Agents primitives. The Agent (brain) defines what Claude knows and can do. The Environment (hands) is where tool execution happens. The Session binds them together and streams events to your application.

Agent — the brain

The Agent resource defines what Claude is and what it can do:

Model: which Claude model to use
System prompt: the agent’s persona, task, and constraints
Tool definitions: what tools the agent can call (described in JSON schema)
MCP servers: connected model context protocol servers
Skills: reusable task templates

The agent loop runs server-side on Anthropic’s infrastructure. This is the key architectural decision: when you close your laptop, the agent keeps running. You do not manage durability yourself, and no long-lived process has to stay alive on your machine.

Environment — the hands

The Environment resource is where tool execution actually happens. This is deliberately decoupled from the agent loop.

Why the decoupling? Two concrete benefits:

Security: Credentials never touch the same container as the agent’s reasoning. The agent can call a “get_secret” tool; the environment handler fetches the secret from a vault and returns only what the agent needs — the agent process itself never has filesystem access to your secrets.
Latency: Previously, spinning up the agent loop and the tool execution environment in the same container meant every new session waited for both. With the decoupled design, Anthropic saw >90% reduction in P95 time-to-first-token because the agent loop can start immediately.
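
The security point can be illustrated with a minimal sketch. Everything in it is hypothetical: the VAULT dict stands in for a real secrets manager, and query_billing_api and handle_get_invoice_total are illustrative names, not part of the Managed Agents API. The credential is read and used inside the environment-side handler, and only the derived answer is returned to the agent.

```python
# Hypothetical sketch: the credential stays on the environment side.
VAULT = {"billing_api_token": "s3cr3t-token"}  # stand-in for a real vault


def query_billing_api(token: str, customer_id: str) -> dict:
    """Stand-in for an authenticated call to an internal service."""
    if token != VAULT["billing_api_token"]:
        raise PermissionError("invalid token")
    return {"customer_id": customer_id, "invoice_total": 1250}


def handle_get_invoice_total(tool_input: dict) -> str:
    """Tool handler: fetches the secret, uses it, returns only the answer."""
    token = VAULT["billing_api_token"]  # never leaves this function
    data = query_billing_api(token, tool_input["customer_id"])
    return f"Invoice total for {data['customer_id']}: {data['invoice_total']}"


result = handle_get_invoice_total({"customer_id": "cust_42"})
```

The string handed back to the agent contains the invoice total but not the token, which is the property the decoupled design is meant to give you.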

Session — the binding

A Session ties an Agent to an Environment and starts the agent loop. Sessions:

Persist in the cloud (hard refresh = no data loss)
Stream events to your application
Have states: idle → running → rescheduling (on retry) → terminated
Can be resumed from any state via webhooks or direct API calls

3. Events, not responses

The Messages API gives you tokens in/tokens out. Managed Agents gives you events.

Every meaningful action in a session produces an event: user message received, tool called, tool result returned, agent response started, agent response completed. Events are appended to the session log and streamed to your application in real time.

Why this matters: if a container goes down mid-task, the session log is intact. The harness spins up a fresh container and continues from the last event. No lost work, no “start over.” This is what production-grade durability looks like.

Your application receives these events via streaming and can surface them to users — progress updates, intermediate results, tool outputs — as they happen instead of waiting for the full response.
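
To make the "continue from the last event" idea concrete, here is a small in-memory sketch. It is not how the platform is implemented; SessionLog, its field names, and the event payloads are assumptions chosen for illustration. It shows why an append-only log is enough for a fresh worker to find out where to pick up.

```python
from dataclasses import dataclass, field


@dataclass
class SessionLog:
    """Append-only event log; a hypothetical stand-in for the server-side log."""
    events: list = field(default_factory=list)

    def append(self, event_type: str, **payload) -> None:
        self.events.append({"seq": len(self.events), "type": event_type, **payload})

    def pending_tool_calls(self) -> list:
        """Tool calls that were logged but never received a result."""
        answered = {e["tool_call_id"] for e in self.events if e["type"] == "tool_result"}
        return [
            e for e in self.events
            if e["type"] == "tool_call" and e["tool_call_id"] not in answered
        ]


log = SessionLog()
log.append("user_message", content="Why is checkout slow?")
log.append("tool_call", tool_call_id="tc_1", tool_name="get_metrics")
log.append("tool_result", tool_call_id="tc_1", result="p95=2.3s")
log.append("tool_call", tool_call_id="tc_2", tool_name="get_recent_deploys")
# -- the container dies here; the log survives --

# A fresh worker reads the log and picks up the unanswered call
resume_from = log.pending_tool_calls()
```

Because nothing is ever overwritten, the replacement worker only needs the log itself to discover that tc_2 is still waiting for a result.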

Check your understanding


1. Why is the agent loop deliberately decoupled from tool execution in Claude Managed Agents?

Coupling the agent loop with tool execution in one container requires spinning up both together, which adds cold-start latency and mixes the agent's reasoning process with credential access. Decoupling solves both: the loop starts immediately, and credentials only touch the isolated execution container.

2. What happens to a Managed Agent session when a container goes down mid-task?

Events are appended to a durable session log server-side. Container failure doesn't affect the log. The harness resumes from the last committed event, so users see a brief pause, not a restart.

3. What is the purpose of "local tool" functions in the workshop's architecture?

Local tools let you run tool execution wherever your data lives (a local DataDog client, a private API, a database) while keeping the agent loop server-side. The wire protocol connects them.

4. Local tools: connecting your infrastructure

In the workshop, the incident-response agent calls tools like get_metrics, get_recent_deploys, and get_diff. These execute locally — simulated from JSON files in the demo, but in production they’d call DataDog, PagerDuty, or GitHub.

The pattern: you define the tools in the Agent (their JSON schema, what they do), and you handle their execution locally in your application. When the agent calls a tool, an event arrives in your stream; you execute the function, return the result; the harness continues the loop.

def handle_tool_call(event):
    tool_name = event.tool_name
    tool_input = event.tool_input

    if tool_name == "get_metrics":
        return fetch_from_datadog(tool_input["service"], tool_input["window"])
    elif tool_name == "get_recent_deploys":
        return fetch_from_github_releases(tool_input["repo"])
    elif tool_name == "get_diff":
        return fetch_git_diff(tool_input["commit_sha"])

    raise ValueError(f"Unknown tool: {tool_name}")

This is the migration path from prototype to production: replace JSON file reads with real client calls, one tool at a time. The agent’s system prompt and behavior don’t change; only the tool implementations do.
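
One way to picture that migration is a handler table whose entries are swapped one at a time. The sketch below is an assumption about how you might organize it, not the workshop's code: from_fixture, get_metrics_live, and the fixture contents are all hypothetical.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical fixture file standing in for the demo's JSON data
fixture_dir = Path(tempfile.mkdtemp())
(fixture_dir / "get_metrics.json").write_text(json.dumps({"p95_ms": 2300}))


def from_fixture(tool_name: str):
    """Build a handler that ignores its input and returns canned JSON."""
    def handler(tool_input: dict) -> str:
        return (fixture_dir / f"{tool_name}.json").read_text()
    return handler


def get_metrics_live(tool_input: dict) -> str:
    """Placeholder for a real client call to your metrics backend."""
    return json.dumps({"service": tool_input["service"], "p95_ms": 180})


# Prototype: the tool reads from a fixture
handlers = {"get_metrics": from_fixture("get_metrics")}
prototype_result = handlers["get_metrics"]({"service": "checkout"})

# Production: swap one entry; the tool name and schema stay the same
handlers["get_metrics"] = get_metrics_live
live_result = handlers["get_metrics"]({"service": "checkout"})
```

The agent definition is untouched by the swap, since it only knows the tool's name and schema.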

5. What the platform handles for you

The workshop emphasizes what you don’t have to build:

Compaction: when context gets long, the harness compacts it automatically — you don’t manage context windows
Caching: prompt caching is handled at the harness level for cost and latency
Retry logic: rescheduling state handles transient failures without your intervention
Session persistence: the conversation is in the cloud; hard refresh, laptop close, network blip — all transparent to the user
Observability: the Managed Agents console shows every event, tool call, and response — the full agent trajectory, inspectable at any time

What you control: the task definition, the system prompt, the tools the agent can call, and the custom tool logic that connects to your infrastructure.

6. Beyond the basics

The workshop briefly covers what’s available beyond the core primitives:

Subagents: an orchestrator agent spins up sub-agents with their own context windows for parallelism
Memory: persistent memory stores that bridge information across sessions (see Agents That Remember)
Outcomes: define a rubric for what success looks like; the agent drives toward the outcome, not just executes steps
Vaults: encrypted credential storage decoupled from agent access — credentials are fetched by the environment on demand, never persisted in the agent’s context
Webhooks: external events (a GitHub webhook, a PagerDuty alert) can wake a session or trigger a new one
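
As an illustration of the webhook item above, the following sketch routes an incoming alert either to an existing session or to a new one. It is a guess at the application-side bookkeeping only: create_session is a stand-in for a real session-creation call, and the payload shape (incident_id, status) is hypothetical.

```python
import itertools

_session_ids = itertools.count(1)
sessions_by_incident: dict[str, str] = {}  # incident id -> session id


def create_session(title: str) -> str:
    """Stand-in for a real session-creation call."""
    return f"sess_{next(_session_ids)}"


def route_webhook(payload: dict) -> tuple[str, str]:
    """Return (action, session_id) for an incoming alert payload."""
    incident_id = payload["incident_id"]
    if incident_id in sessions_by_incident:
        # Known incident: wake the session that is already tracking it
        return "wake", sessions_by_incident[incident_id]
    session_id = create_session(f"Incident {incident_id}")
    sessions_by_incident[incident_id] = session_id
    return "create", session_id


first = route_webhook({"incident_id": "INC-7", "status": "triggered"})
second = route_webhook({"incident_id": "INC-7", "status": "escalated"})
```

In a real deployment the mapping would live in a database rather than a module-level dict, so that it survives restarts of your webhook receiver.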

Build it yourself

Follow these exact steps to reproduce it yourself · estimated time: ~90 minutes

Prerequisites

An Anthropic API key with Claude Managed Agents access
Python 3.10+ and pip
One real data source you care about (could be a GitHub repo, a log file, a database)

Build a minimal agent that connects to one real data source and can answer questions about it.

Step 1 — Set up

mkdir my-managed-agent && cd my-managed-agent
python3 -m venv .venv && source .venv/bin/activate
pip install anthropic streamlit
export ANTHROPIC_API_KEY="your-api-key"   # anthropic.Anthropic() reads the key from this variable

Step 2 — Define your agent

import anthropic

client = anthropic.Anthropic()

agent = client.beta.managed_agents.agents.create(
    name="my-data-agent",
    model="claude-opus-4-8",
    system_prompt="""You are an assistant that helps engineers understand system state.
    
You have access to tools that fetch real data. Always fetch data before answering;
do not speculate about current state. When you find issues, explain root cause 
and suggest remediation steps.""",
    tools=[
        {
            "name": "get_recent_logs",
            "description": "Fetch the last N log lines from the service",
            "input_schema": {
                "type": "object",
                "properties": {
                    "service": {"type": "string", "description": "Service name"},
                    "lines": {"type": "integer", "default": 50},
                },
                "required": ["service"],
            },
        }
    ],
)
print(f"Agent created: {agent.id}")

Step 3 — Define your environment

env = client.beta.managed_agents.environments.create(
    name="my-env",
    networking={"allowed_domains": ["*"]},  # restrict in production
)
print(f"Environment: {env.id}")

Step 4 — Implement your local tool

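from collections import deque
from pathlib import Path

LOG_DIR = Path("/var/log")

def get_recent_logs(service: str, lines: int = 50) -> str:
    # Replace with your real data source:
    # - Read from a log file
    # - Query CloudWatch, Datadog, or Grafana
    # - Hit an internal API

    # Minimal example: read a local log file.
    # The service name is chosen by the model, so reject anything that
    # could escape LOG_DIR (for example "../../etc/passwd").
    if not service.replace("-", "").replace("_", "").isalnum():
        return f"[Invalid service name: {service!r}]"
    log_path = LOG_DIR / f"{service}.log"
    try:
        with open(log_path) as f:
            # deque(maxlen=N) keeps only the last N lines in memory
            tail = deque(f, maxlen=max(lines, 1))
        return "".join(tail)
    except FileNotFoundError:
        return f"[No log file found for {service}]"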

TOOL_HANDLERS = {
    "get_recent_logs": lambda inp: get_recent_logs(**inp),
}

Step 5 — Create a session and stream events

def run_query(agent_id: str, env_id: str, user_question: str):
    session = client.beta.managed_agents.sessions.create(
        agent_id=agent_id,
        environment_id=env_id,
        title=f"Query: {user_question[:50]}",
    )

    # Send the user's question
    client.beta.managed_agents.sessions.messages.create(
        session_id=session.id,
        content=user_question,
    )

    # Stream events
    for event in client.beta.managed_agents.sessions.stream(session.id):
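        if event.type == "tool_call":
            handler = TOOL_HANDLERS.get(event.tool_name)
            try:
                if handler is None:
                    result = f"[Unknown tool: {event.tool_name}]"
                else:
                    result = handler(event.tool_input)
            except Exception as exc:
                result = f"[Tool {event.tool_name} failed: {exc}]"
            # Always send a result back, even on failure, so the agent loop
            # has something to continue from.
            client.beta.managed_agents.sessions.tool_results.create(
                session_id=session.id,
                tool_call_id=event.tool_call_id,
                result=result,
            )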
        elif event.type == "agent_response":
            yield event.content   # stream to the UI
        elif event.type == "session_completed":
            break

# Usage
for chunk in run_query(agent.id, env.id, "What's happening in the auth service?"):
    print(chunk, end="", flush=True)
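
Before pointing the loop at a live session, it can help to exercise the branching offline. The sketch below is a test harness under stated assumptions: fake_stream yields hypothetical events shaped like the ones Step 5 expects (tool_call, agent_response, session_completed), and drive mirrors the branching in run_query with the API calls injected as plain functions and a fallback for unknown tools.

```python
from types import SimpleNamespace


def fake_stream():
    """Hypothetical events shaped like the ones the Step 5 loop expects."""
    yield SimpleNamespace(type="tool_call", tool_name="get_recent_logs",
                          tool_input={"service": "auth"}, tool_call_id="tc_1")
    yield SimpleNamespace(type="agent_response", content="Auth looks healthy.")
    yield SimpleNamespace(type="session_completed")
    yield SimpleNamespace(type="agent_response", content="never reached")


def drive(events, handlers, submit_result):
    """Dispatch events the way run_query does, without any network calls."""
    for event in events:
        if event.type == "tool_call":
            handler = handlers.get(event.tool_name)
            result = (handler(event.tool_input) if handler
                      else f"[Unknown tool: {event.tool_name}]")
            submit_result(event.tool_call_id, result)
        elif event.type == "agent_response":
            yield event.content
        elif event.type == "session_completed":
            break


submitted = []
chunks = list(drive(
    fake_stream(),
    {"get_recent_logs": lambda inp: f"3 lines from {inp['service']}"},
    lambda call_id, result: submitted.append((call_id, result)),
))
```

Injecting submit_result keeps the dispatch logic testable on its own; in the real loop that argument would wrap the tool-result call shown in Step 5.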

Step 6 — Add session persistence

def list_sessions(agent_id: str):
    sessions = client.beta.managed_agents.sessions.list(agent_id=agent_id)
    return [(s.id, s.title, s.status) for s in sessions]

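def resume_session(session_id: str, new_message: str):
    client.beta.managed_agents.sessions.messages.create(
        session_id=session_id,
        content=new_message,
    )
    for event in client.beta.managed_agents.sessions.stream(session_id):
        if event.type == "tool_call":
            # A resumed session can still call tools, so answer them here too
            handler = TOOL_HANDLERS.get(event.tool_name)
            result = handler(event.tool_input) if handler else f"[Unknown tool: {event.tool_name}]"
            client.beta.managed_agents.sessions.tool_results.create(
                session_id=session_id,
                tool_call_id=event.tool_call_id,
                result=result,
            )
        elif event.type == "agent_response":
            yield event.content
        elif event.type == "session_completed":
            break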

Step 7 — Extend with a second tool

Add a second tool to the Agent definition and a matching handler. Common next tools:

search_deployments — query your deployment history
get_error_rate — fetch error rate from your APM
list_open_incidents — query PagerDuty or OpsGenie

Each new tool makes the agent more capable without changing the session or environment.
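
As the tool count grows, the schema in the Agent definition and the handler in your application can drift apart. One possible way to keep them together is a small registry, sketched below. The tool decorator and TOOLS dict are hypothetical helpers, not part of any SDK, and the get_error_rate body returns a placeholder value rather than querying a real APM.

```python
TOOLS = {}  # name -> {"schema": ..., "handler": ...}


def tool(name: str, description: str, input_schema: dict):
    """Register a handler together with the schema the agent will see."""
    def decorator(fn):
        TOOLS[name] = {
            "schema": {"name": name, "description": description,
                       "input_schema": input_schema},
            "handler": fn,
        }
        return fn
    return decorator


@tool(
    "get_error_rate",
    "Fetch the error rate for a service over a time window",
    {
        "type": "object",
        "properties": {
            "service": {"type": "string"},
            "window_minutes": {"type": "integer"},
        },
        "required": ["service"],
    },
)
def get_error_rate(service: str, window_minutes: int = 15) -> str:
    # Placeholder value; replace with a query to your APM
    return f"{service}: 0.4% errors over the last {window_minutes} minutes"


# Derive both halves from the single registry
tool_definitions = [t["schema"] for t in TOOLS.values()]  # for tools=[...]
tool_handlers = {name: (lambda inp, fn=t["handler"]: fn(**inp))
                 for name, t in TOOLS.items()}

answer = tool_handlers["get_error_rate"]({"service": "auth"})
```

With this layout, adding a third tool is one decorated function, and the list passed to the agent and the handler table cannot disagree about names.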

Production checklist

Before going live:

Restrict allowed_domains in the environment to only what your tools need
Store credentials in Vaults, not in tool handler code
Add session deletion on user logout: client.beta.managed_agents.sessions.delete(session_id)
Set up webhook triggers for event-driven activation (new incident → new session)
Review session logs via the console observability dashboard before go-live
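
For the first checklist item, one low-effort approach is to derive the allow-list from the endpoints your tools actually call instead of maintaining it by hand. The sketch below assumes a hypothetical TOOL_ENDPOINTS mapping; the URLs and the acme/shop repository are illustrative only.

```python
from urllib.parse import urlparse

# Hypothetical endpoints your tools call; replace with your own
TOOL_ENDPOINTS = {
    "get_metrics": "https://api.datadoghq.com/api/v1/query",
    "get_recent_deploys": "https://api.github.com/repos/acme/shop/releases",
    "get_diff": "https://api.github.com/repos/acme/shop/commits",
}

# Unique hostnames, sorted for stable diffs in code review
allowed_domains = sorted({urlparse(url).hostname for url in TOOL_ENDPOINTS.values()})
```

The resulting list could then replace the ["*"] wildcard used in Step 3.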

Where to go next

Agents That Remember — add persistent memory to your agent so it learns from past sessions
Watch the original workshop to see Isabella walk through the incident-response agent live
Claude Managed Agents documentation for the full API reference including Vaults, Outcomes, and Webhooks

Related lessons

Giving Agents Their Own Computers

Routines, CI Autofix, and the Advisor Strategy

Trustworthy Agentic Workflows with a Custom DSL
