Ship Your First Managed Agent: Agent, Environment, Session
Claude Managed Agents is the fastest path from prototype to production-ready agent. This lesson walks through the three core primitives — Agent (brain), Environment (hands), Session (the binding) — and shows how to wire them into a working incident-response agent.
This lesson is original educational writing based on this video by Claude (published May 26, 2026). All credit for the original content goes to the creators.
1. The evolution from API to managed infrastructure
When Claude first launched, the only interface was the Messages API: tokens in, tokens out. Developers who wanted an agent loop had to build it themselves: context management, tool-call routing, compaction, caching, and retry logic. While agents could only handle simple tasks, this was manageable. As models became capable of multi-step, multi-tool tasks, the complexity of that hand-built plumbing exploded.
The Agent SDK gave developers a programmatic way to run Claude Code as an agent. But developers still had to manage hosting and container scaling themselves.
Claude Managed Agents is the first harness where Anthropic manages the infrastructure: sandboxing, compaction, caching, reliability, and scaling are handled for you. You configure the task and tools; the platform runs the loop.
The claim from the workshop: teams reach production 10–15x faster with Managed Agents than when rolling their own agent harness.
2. The three primitives
Every Managed Agent is composed of three resources:
Agent — the brain
The Agent resource defines what Claude is and what it can do:
- Model: which Claude model to use
- System prompt: the agent’s persona, task, and constraints
- Tool definitions: what tools the agent can call (described in JSON schema)
- MCP servers: connected Model Context Protocol (MCP) servers
- Skills: reusable task templates
The agent loop runs server-side on Anthropic’s infrastructure. This is the key architectural decision: when you close your laptop, the agent keeps running. No durability management, no process-hanging-on-your-machine.
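To make the shape of an Agent definition concrete, here is a minimal local sketch. `AgentConfig` and `check_tool_definition` are hypothetical helpers, not SDK types; they only mirror the fields listed above and sanity-check a tool's JSON schema before you send it anywhere. The `get_metrics` inputs (`service`, `window`) follow the handler shown later in this lesson.

```python
from dataclasses import dataclass, field


@dataclass
class AgentConfig:
    """Hypothetical local mirror of the Agent fields (not an SDK class)."""
    model: str
    system_prompt: str
    tools: list[dict] = field(default_factory=list)
    mcp_servers: list[str] = field(default_factory=list)
    skills: list[str] = field(default_factory=list)


def check_tool_definition(tool: dict) -> list[str]:
    """Return the problems found in one tool definition; empty means it looks well-formed."""
    problems = []
    for key in ("name", "description", "input_schema"):
        if key not in tool:
            problems.append(f"missing key: {key}")
    schema = tool.get("input_schema", {})
    if schema.get("type") != "object":
        problems.append("input_schema.type should be 'object'")
    properties = schema.get("properties", {})
    for required_name in schema.get("required", []):
        if required_name not in properties:
            problems.append(f"required field not in properties: {required_name}")
    return problems


get_metrics_tool = {
    "name": "get_metrics",
    "description": "Fetch metrics for a service over a time window",
    "input_schema": {
        "type": "object",
        "properties": {
            "service": {"type": "string"},
            "window": {"type": "string"},
        },
        "required": ["service", "window"],
    },
}

config = AgentConfig(
    model="claude-opus-4-8",  # model ID as used in the build steps of this lesson
    system_prompt="You investigate incidents. Fetch data before answering.",
    tools=[get_metrics_tool],
)

for tool in config.tools:
    print(tool["name"], check_tool_definition(tool))
```

Checking definitions locally is a design convenience rather than a platform requirement: it catches a misspelled `required` entry before the agent ever tries to call the tool.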
Environment — the hands
The Environment resource is where tool execution actually happens. This is deliberately decoupled from the agent loop.
Why the decoupling? Two concrete benefits:
- Security: Credentials never touch the same container as the agent’s reasoning. The agent can call a “get_secret” tool; the environment handler fetches the secret from a vault and returns only what the agent needs. The agent process itself never has filesystem access to your secrets.
- Latency: Previously, spinning up the agent loop and the tool execution environment in the same container meant every new session waited for both. With the decoupled design, Anthropic saw >90% reduction in P95 time-to-first-token because the agent loop can start immediately.
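The security benefit can be sketched as an environment-side handler that uses a secret but returns only the derived result. Everything here is illustrative: `_VAULT` is an in-memory stand-in for a real secrets manager, and `sign_request` is a hypothetical tool handler, not part of any SDK.

```python
import hashlib
import hmac

# Hypothetical in-memory stand-in for a vault; a real deployment
# would query a secrets manager from inside the environment.
_VAULT = {"billing-api": "s3cr3t-token"}


def sign_request(payload: str, secret_name: str) -> dict:
    """Environment-side handler: uses the secret, returns only the signature."""
    secret = _VAULT[secret_name]  # never leaves the environment process
    signature = hmac.new(
        secret.encode(), payload.encode(), hashlib.sha256
    ).hexdigest()
    # The tool result carries the derived value, not the raw secret.
    return {"payload": payload, "signature": signature}


result = sign_request('{"invoice": 42}', "billing-api")
print(sorted(result))
```

The agent sees a signed payload it can pass along, while the token itself never enters the agent's context.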
Session — the binding
A Session ties an Agent to an Environment and starts the agent loop. Sessions:
- Persist in the cloud (hard refresh = no data loss)
- Stream events to your application
- Have states: idle → running → rescheduling (on retry) → terminated
- Can be resumed from any state via webhooks or direct API calls
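On the client side, the session states can be handled with a small enum. This is a sketch under assumptions: the four status strings come from the list above, while the UI messages and the `should_keep_streaming` rule are illustrative choices, not documented platform behavior.

```python
from enum import Enum


class SessionState(str, Enum):
    IDLE = "idle"
    RUNNING = "running"
    RESCHEDULING = "rescheduling"
    TERMINATED = "terminated"


def describe(status: str) -> str:
    """Translate a raw status string into a short message for the UI."""
    try:
        state = SessionState(status)
    except ValueError:
        # Tolerate statuses this sketch does not know about.
        return f"Unknown session status: {status}"
    messages = {
        SessionState.IDLE: "Waiting for your next message",
        SessionState.RUNNING: "Agent is working",
        SessionState.RESCHEDULING: "Retrying after a transient failure",
        SessionState.TERMINATED: "Session has ended",
    }
    return messages[state]


def should_keep_streaming(status: str) -> bool:
    """Assumption: only running or rescheduling sessions emit events unprompted."""
    return status in (SessionState.RUNNING.value, SessionState.RESCHEDULING.value)


print(describe("rescheduling"))
```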
3. Events, not responses
The Messages API gives you tokens in/tokens out. Managed Agents gives you events.
Every meaningful action in a session produces an event: user message received, tool called, tool result returned, agent response started, agent response completed. Events are appended to the session log and streamed to your application in real time.
Why this matters: if a container goes down mid-task, the session log is intact. The harness spins up a fresh container and continues from the last event. No lost work, no “start over.” This is what production-grade durability looks like.
Your application receives these events via streaming and can surface them to users — progress updates, intermediate results, tool outputs — as they happen instead of waiting for the full response.
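The durability argument rests on the log being append-only. The toy model below is a local illustration, not the platform's implementation: `SessionLog` and the event type names (`tool_call`, `tool_result`) are assumptions chosen to match the handler code later in this lesson. It shows how a fresh container could work out what is still outstanding purely from the log.

```python
from dataclasses import dataclass, field


@dataclass
class Event:
    seq: int
    event_type: str
    payload: dict


@dataclass
class SessionLog:
    """Toy append-only log; events are never edited or removed."""
    events: list[Event] = field(default_factory=list)

    def append(self, event_type: str, payload: dict) -> Event:
        event = Event(seq=len(self.events), event_type=event_type, payload=payload)
        self.events.append(event)
        return event

    def pending_tool_calls(self) -> list[Event]:
        """Tool calls with no matching result: the work a fresh container resumes."""
        answered = {
            e.payload["tool_call_id"]
            for e in self.events
            if e.event_type == "tool_result"
        }
        return [
            e for e in self.events
            if e.event_type == "tool_call"
            and e.payload["tool_call_id"] not in answered
        ]


log = SessionLog()
log.append("user_message", {"text": "Why is auth slow?"})
log.append("tool_call", {"tool_call_id": "tc_1", "tool_name": "get_metrics"})
log.append("tool_result", {"tool_call_id": "tc_1", "result": "p95 up"})
log.append("tool_call", {"tool_call_id": "tc_2", "tool_name": "get_recent_deploys"})
# Imagine the container dies here. The log alone says what is unfinished.
pending_ids = [e.payload["tool_call_id"] for e in log.pending_tool_calls()]
print(pending_ids)
```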
Check your understanding
1. Why is the agent loop deliberately decoupled from tool execution in Claude Managed Agents?
2. What happens to a Managed Agent session when a container goes down mid-task?
3. What is the purpose of "local tool" functions in the workshop's architecture?
4. Local tools: connecting your infrastructure
In the workshop, the incident-response agent calls tools like get_metrics, get_recent_deploys, and get_diff. These execute locally: the demo simulates them from JSON files, but in production they would call Datadog, PagerDuty, or GitHub.
The pattern: you define the tools in the Agent (their JSON schema, what they do), and you handle their execution locally in your application. When the agent calls a tool, an event arrives in your stream; you execute the function, return the result; the harness continues the loop.
def handle_tool_call(event):
    tool_name = event.tool_name
    tool_input = event.tool_input
    if tool_name == "get_metrics":
        return fetch_from_datadog(tool_input["service"], tool_input["window"])
    elif tool_name == "get_recent_deploys":
        return fetch_from_github_releases(tool_input["repo"])
    elif tool_name == "get_diff":
        return fetch_git_diff(tool_input["commit_sha"])
    raise ValueError(f"Unknown tool: {tool_name}")
This is the migration path from prototype to production: replace JSON file reads with real client calls, one tool at a time. The agent’s system prompt and behavior don’t change; only the tool implementations do.
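One way to keep that migration mechanical is to build each handler from a backend factory, so swapping a fixture for a real client is a one-line change. This is a sketch, not the workshop's code: `make_json_backend`, `make_client_backend`, the fixture layout, and the sample numbers are all hypothetical.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def make_json_backend(fixture_dir: Path) -> Callable[[dict], dict]:
    """Prototype backend: answer get_metrics from a JSON fixture file."""
    def get_metrics(tool_input: dict) -> dict:
        path = fixture_dir / f"{tool_input['service']}.json"
        return json.loads(path.read_text())
    return get_metrics


def make_client_backend(fetch: Callable[[str, str], dict]) -> Callable[[dict], dict]:
    """Production backend: delegate to a real metrics client passed in as `fetch`."""
    def get_metrics(tool_input: dict) -> dict:
        return fetch(tool_input["service"], tool_input["window"])
    return get_metrics


# Prototype: the handler reads a fixture written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    fixtures = Path(tmp)
    (fixtures / "auth.json").write_text(json.dumps({"p95_ms": 840}))
    handlers = {"get_metrics": make_json_backend(fixtures)}
    prototype_result = handlers["get_metrics"]({"service": "auth", "window": "15m"})

# Production: same tool name, different backend. The lambda stands in
# for a real client call and returns made-up sample data.
handlers["get_metrics"] = make_client_backend(
    lambda service, window: {"service": service, "window": window, "p95_ms": 120}
)
production_result = handlers["get_metrics"]({"service": "auth", "window": "15m"})
print(prototype_result, production_result)
```

Because the tool name and input shape stay fixed, the Agent definition does not need to change when the backend does.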
5. What the platform handles for you
The workshop emphasizes what you don’t have to build:
- Compaction: when context gets long, the harness compacts it automatically — you don’t manage context windows
- Caching: prompt caching is handled at the harness level for cost and latency
- Retry logic: the rescheduling state handles transient failures without your intervention
- Session persistence: the conversation is in the cloud; hard refresh, laptop close, network blip: all transparent to the user
- Observability: the Managed Agents console shows every event, tool call, and response — the full agent trajectory, inspectable at any time
What you control: the task definition, the system prompt, the tools the agent can call, and the custom tool logic that connects to your infrastructure.
6. Beyond the basics
The workshop briefly covers what’s available beyond the core primitives:
- Subagents: an orchestrator agent spins up sub-agents with their own context windows for parallelism
- Memory: persistent memory stores that bridge information across sessions (see Agents That Remember)
- Outcomes: define a rubric for what success looks like; the agent drives toward the outcome, not just executes steps
- Vaults: encrypted credential storage decoupled from agent access — credentials are fetched by the environment on demand, never persisted in the agent’s context
- Webhooks: external events (a GitHub webhook, a PagerDuty alert) can wake a session or trigger a new one
Build it yourself
Follow these exact steps to reproduce it yourself · estimated time: ~90 minutes
Prerequisites
- An Anthropic API key with Claude Managed Agents access
- Python 3.10+ and pip
- One real data source you care about (could be a GitHub repo, a log file, a database)
Build a minimal agent that connects to one real data source and can answer questions about it.
Step 1 — Set up
mkdir my-managed-agent && cd my-managed-agent
python3 -m venv .venv && source .venv/bin/activate
pip install anthropic streamlit
export ANTHROPIC_API_KEY="your-api-key"  # anthropic.Anthropic() reads this environment variable
Step 2 — Define your agent
import anthropic

client = anthropic.Anthropic()

agent = client.beta.managed_agents.agents.create(
    name="my-data-agent",
    model="claude-opus-4-8",
    system_prompt="""You are an assistant that helps engineers understand system state.
You have access to tools that fetch real data. Always fetch data before answering;
do not speculate about current state. When you find issues, explain root cause
and suggest remediation steps.""",
    tools=[
        {
            "name": "get_recent_logs",
            "description": "Fetch the last N log lines from the service",
            "input_schema": {
                "type": "object",
                "properties": {
                    "service": {"type": "string", "description": "Service name"},
                    "lines": {"type": "integer", "default": 50},
                },
                "required": ["service"],
            },
        }
    ],
)
print(f"Agent created: {agent.id}")
Step 3 — Define your environment
env = client.beta.managed_agents.environments.create(
    name="my-env",
    networking={"allowed_domains": ["*"]},  # restrict in production
)
print(f"Environment: {env.id}")
Step 4 — Implement your local tool
def get_recent_logs(service: str, lines: int = 50) -> str:
    # Replace with your real data source:
    # - Read from a log file
    # - Query CloudWatch, Datadog, or Grafana
    # - Hit an internal API
    # Minimal example: read a local log file
    log_path = f"/var/log/{service}.log"
    try:
        with open(log_path) as f:
            content = f.readlines()
        return "".join(content[-lines:])
    except FileNotFoundError:
        return f"[No log file found for {service}]"

TOOL_HANDLERS = {
    "get_recent_logs": lambda inp: get_recent_logs(**inp),
}
Step 5 — Create a session and stream events
def run_query(agent_id: str, env_id: str, user_question: str):
    session = client.beta.managed_agents.sessions.create(
        agent_id=agent_id,
        environment_id=env_id,
        title=f"Query: {user_question[:50]}",
    )
    # Send the user's question
    client.beta.managed_agents.sessions.messages.create(
        session_id=session.id,
        content=user_question,
    )
    # Stream events
    for event in client.beta.managed_agents.sessions.stream(session.id):
        if event.type == "tool_call":
            handler = TOOL_HANDLERS.get(event.tool_name)
            if handler:
                result = handler(event.tool_input)
            else:
                # Report unknown tools back so the agent is not left waiting for a result
                result = f"[Unknown tool: {event.tool_name}]"
            client.beta.managed_agents.sessions.tool_results.create(
                session_id=session.id,
                tool_call_id=event.tool_call_id,
                result=result,
            )
        elif event.type == "agent_response":
            yield event.content  # stream to the UI
        elif event.type == "session_completed":
            break

# Usage
for chunk in run_query(agent.id, env.id, "What's happening in the auth service?"):
    print(chunk, end="", flush=True)
Step 6 — Add session persistence
def list_sessions(agent_id: str):
    sessions = client.beta.managed_agents.sessions.list(agent_id=agent_id)
    return [(s.id, s.title, s.status) for s in sessions]

def resume_session(session_id: str, new_message: str):
    client.beta.managed_agents.sessions.messages.create(
        session_id=session_id,
        content=new_message,
    )
    for event in client.beta.managed_agents.sessions.stream(session_id):
        if event.type == "agent_response":
            yield event.content
        elif event.type == "session_completed":
            break
Step 7 — Extend with a second tool
Add a second tool to the Agent definition and a matching handler. Common next tools:
- search_deployments: query your deployment history
- get_error_rate: fetch error rate from your APM
- list_open_incidents: query PagerDuty or OpsGenie
Each new tool makes the agent more capable without changing the session or environment.
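As the number of tools grows, a small registry can keep handlers and names in sync. The sketch below is one possible arrangement, not the workshop's code: the `tool` decorator and `dispatch` function are hypothetical, the handler bodies are placeholders that return made-up text, and `dispatch` passes the input as keyword arguments. It also returns failures as text so a bad call produces a tool result instead of an exception in your event loop.

```python
from typing import Callable

TOOL_HANDLERS: dict[str, Callable[..., str]] = {}


def tool(name: str):
    """Register the decorated function as the handler for the tool `name`."""
    def register(func: Callable[..., str]) -> Callable[..., str]:
        TOOL_HANDLERS[name] = func
        return func
    return register


@tool("get_recent_logs")
def get_recent_logs(service: str, lines: int = 50) -> str:
    return f"(last {lines} lines for {service})"  # placeholder body


@tool("get_error_rate")
def get_error_rate(service: str) -> str:
    return f"error rate for {service}: (query your APM here)"  # placeholder body


def dispatch(tool_name: str, tool_input: dict) -> str:
    """Run the matching handler; report failures as text so the agent can react."""
    handler = TOOL_HANDLERS.get(tool_name)
    if handler is None:
        return f"[error] unknown tool: {tool_name}"
    try:
        return handler(**tool_input)
    except Exception as exc:  # broad on purpose: any failure becomes a tool result
        return f"[error] {tool_name} failed: {exc}"


print(dispatch("get_error_rate", {"service": "auth"}))
print(dispatch("list_open_incidents", {}))
```

Adding a third tool then means writing one decorated function plus its schema entry in the Agent definition.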
Production checklist
Before going live:
- Restrict allowed_domains in the environment to only what your tools need
- Store credentials in Vaults, not in tool handler code
- Add session deletion on user logout: client.beta.managed_agents.sessions.delete(session_id)
- Set up webhook triggers for event-driven activation (new incident → new session)
- Review session logs via the console observability dashboard before go-live
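For the first checklist item, one approach is to derive the allowlist from the endpoints your tools actually call rather than maintaining it by hand. This is a sketch under assumptions: the `TOOL_ENDPOINTS` mapping and its URLs are illustrative, and the resulting list is meant for the `allowed_domains` setting shown in Step 3.

```python
from urllib.parse import urlparse

# Illustrative endpoints; replace with the URLs your own tool handlers call.
TOOL_ENDPOINTS = {
    "get_metrics": "https://api.datadoghq.com/api/v1/query",
    "get_recent_deploys": "https://api.github.com/repos/example-org/example-repo/releases",
    "get_diff": "https://api.github.com/repos/example-org/example-repo/commits",
}


def allowed_domains(endpoints: dict[str, str]) -> list[str]:
    """Collect the unique hostnames the tools need, sorted for stable output."""
    hosts = {urlparse(url).hostname for url in endpoints.values()}
    return sorted(host for host in hosts if host)


domains = allowed_domains(TOOL_ENDPOINTS)
print(domains)
```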
Where to go next
- Agents That Remember — add persistent memory to your agent so it learns from past sessions
- Watch the original workshop to see Isabella walk through the incident-response agent live
- Claude Managed Agents documentation for the full API reference including Vaults, Outcomes, and Webhooks