Prompting for Agents: Steering Models That Act

1. What an agent is (and when you want one)

Anthropic’s working definition is refreshingly small: an agent is a model using tools in a loop. Three things define it:

an environment it operates in (a codebase, a browser, your CRM),
tools that let it observe and change that environment,
a system prompt defining its goal, constraints and ideal behavior.

The model decides what to do next based on what the environment returned last. That autonomy is the whole point — and the whole risk. Contrast with a workflow, where you hard-code the sequence (classify → extract → format). Workflows are cheaper, faster and more predictable; agents shine where you can’t know the steps in advance.
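
To make the contrast concrete, here is a minimal sketch of a workflow as a hard-coded sequence. The three step functions (classify, extract, format_reply) and the '#123' order-id format are hypothetical stand-ins for model calls, not part of any library. The point is only that the order of steps is fixed in code rather than chosen by the model.

```python
# Hypothetical stand-ins for three model calls; a real workflow would call an LLM in each.
def classify(ticket: str) -> str:
    return "refund" if "refund" in ticket.lower() else "other"

def extract(ticket: str) -> dict:
    # Take the first token that looks like an order id (assumed format: '#123').
    order = next((w.strip(".,") for w in ticket.split() if w.startswith("#")), None)
    return {"order_id": order}

def format_reply(category: str, fields: dict) -> str:
    return f"[{category}] order={fields['order_id']}"

def workflow(ticket: str) -> str:
    # The sequence is decided by the programmer: classify -> extract -> format.
    category = classify(ticket)
    fields = extract(ticket)
    return format_reply(category, fields)

print(workflow("Please refund order #4821."))
```

Nothing the ticket says can change which step runs next, which is exactly why a workflow is predictable and cheap.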

Use an agent when the task is:

Complex — the path can’t be predicted up front (debugging, research, multi-file changes).
Valuable — the outcome justifies tokens and latency.
Feasible — the model demonstrably can do the individual subtasks; de-risk with small tests.
Recoverable — the cost of an error is acceptable or reversible (or gated behind approval).

A task that fails this checklist deserves a workflow, not an agent. “Don’t build agents for everything” is rule number one.
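
The checklist can be written down as a tiny decision helper. This is only an illustration of the four criteria above; the Task class and recommend function are hypothetical names, and in practice each flag is a human judgment call rather than something computed.

```python
from dataclasses import dataclass

@dataclass
class Task:
    complex: bool      # path cannot be predicted up front
    valuable: bool     # outcome justifies tokens and latency
    feasible: bool     # model can do the individual subtasks
    recoverable: bool  # errors are acceptable, reversible, or gated

def recommend(task: Task) -> str:
    # Any failed criterion points to a workflow instead of an agent.
    failed = [name for name, ok in vars(task).items() if not ok]
    if not failed:
        return "agent"
    return "workflow (fails: " + ", ".join(failed) + ")"

print(recommend(Task(complex=True, valuable=True, feasible=True, recoverable=False)))
```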

The agent loop: the model reasons, calls a tool, observes the result, and repeats — bounded by budgets and guardrails — until it can deliver an answer.
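
The loop can be sketched without any API at all. In the toy version below, the "model" is a scripted function and the action format (("tool", name, arg) or ("answer", text)) is an assumption made up for this illustration. A real agent replaces scripted_model with an LLM call.

```python
# Toy protocol (an assumption for illustration): the model returns either
# ("tool", tool_name, argument) or ("answer", final_text).
def scripted_model(observations: list) -> tuple:
    if not observations:
        return ("tool", "lookup", "capital of France")
    return ("answer", f"Found: {observations[-1]}")  # answer from the last observation

TOY_TOOLS = {"lookup": lambda q: {"capital of France": "Paris"}.get(q, "unknown")}

def run_loop(model, tools, max_steps: int = 5) -> str:
    observations = []
    for _ in range(max_steps):                 # budget: bounded number of steps
        action = model(observations)           # reason: decide the next move
        if action[0] == "answer":
            return action[1]                   # deliver the answer and stop
        _, name, arg = action
        observations.append(tools[name](arg))  # act, then observe the result
    return "Stopped: step budget exhausted."

print(run_loop(scripted_model, TOY_TOOLS))
```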

2. Prompting an agent is a different sport

A classic prompt scripts a single response. An agent prompt configures behavior across an unpredictable number of steps. The talk’s core advice: think like your agent. Sit where it sits: it wakes up with your system prompt, sees only what tools return, and must decide everything else itself. Prompts fail when they assume context the agent never has.

What goes into a good agent prompt:

Heuristics, not scripts. You can’t enumerate every situation, so teach judgment. Examples from real agent prompts: “start with broad searches, then narrow down”, “prefer primary sources”, “simple questions need under 5 tool calls; hard ones may justify 15”. Each heuristic generalizes across thousands of situations a script would miss.

Budgets and stopping criteria. Agents over-search and over-iterate by default. Give explicit resource guidance — number of tool calls, when an answer is “good enough”, when to give up and report failure honestly. Unbounded loops are how an agent burns 50× the tokens for 2% better answers.
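
A prompt-level budget is easier to trust when code backs it up. The sketch below is one possible shape, not an established API: ToolBudget, its soft and hard limits, and the reminder text are all hypothetical. The soft limit is where the agent gets told to wrap up; the hard limit is where the loop stops regardless.

```python
class ToolBudget:
    """Counts tool calls and reports whether the agent should continue."""

    def __init__(self, soft_limit: int, hard_limit: int) -> None:
        if not 0 < soft_limit <= hard_limit:
            raise ValueError("need 0 < soft_limit <= hard_limit")
        self.soft_limit = soft_limit
        self.hard_limit = hard_limit
        self.used = 0

    def record_call(self) -> None:
        self.used += 1

    def status(self) -> str:
        if self.used >= self.hard_limit:
            return "stop"      # the loop must end now
        if self.used >= self.soft_limit:
            return "wrap_up"   # tell the agent to answer with what it has
        return "ok"

    def reminder(self) -> str:
        # Text that could be appended to a tool result so the agent sees its budget.
        return f"[budget] {self.used}/{self.hard_limit} tool calls used; status: {self.status()}"

budget = ToolBudget(soft_limit=2, hard_limit=3)
seen = [budget.status()]
for _ in range(3):
    budget.record_call()
    seen.append(budget.status())
print(seen)
```

Surfacing the reminder inside tool results is a design choice: the agent only knows what tools return, so that is where budget information has to appear if it is meant to influence behavior.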

Tool guidance. When two tools overlap, say which to prefer and when. Describe what each tool is for, not just its signature — most “agent bugs” are really tool-description bugs.

Guardrails for irreversibility. Separate read from write. Anything destructive or customer-visible (sending email, deleting records, pushing code) should require explicit approval or simply not be exposed as a tool. Recoverable-by-design beats “hope it behaves”.
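
One way to separate read from write in code is an approval gate in front of tool dispatch. Everything below is a hypothetical sketch: the tool names, the two sets, and the approve callback (which in a real system would ask a human) are assumptions for illustration.

```python
from typing import Callable

sent: list[str] = []  # records side effects so the example is observable

def search_records(query: str) -> str:
    return f"0 records match {query!r}"

def send_email(to: str, body: str) -> str:
    sent.append(to)
    return f"email sent to {to}"

TOOL_IMPLS = {"search_records": search_records, "send_email": send_email}
READ_ONLY = {"search_records"}     # safe to run without asking
NEEDS_APPROVAL = {"send_email"}    # customer-visible, so gated

def guarded_call(name: str, args: dict, approve: Callable[[str, dict], bool]) -> str:
    if name in READ_ONLY:
        return TOOL_IMPLS[name](**args)
    if name in NEEDS_APPROVAL:
        if approve(name, args):
            return TOOL_IMPLS[name](**args)
        return f"Denied: {name} requires human approval and did not get it."
    # Anything not listed is simply not exposed.
    return f"Error: {name} is not an exposed tool."

print(guarded_call("send_email", {"to": "a@example.com", "body": "hi"}, lambda n, a: False))
```

The denial comes back as an ordinary tool result, so the agent can read it and adjust its plan instead of crashing.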

Let it think between steps. Enable extended thinking so the agent plans before acting, and reflects after each tool result (“interleaved thinking”): did that search actually answer the question? Should I change strategy? Reflection between tool calls is where agents recover from dead ends instead of doubling down.
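
As a rough sketch of what enabling this looks like with the Anthropic Messages API: the thinking parameter and the interleaved-thinking beta header below reflect the API as I understand it at the time of writing, and both the header value and whether a given model needs it should be checked against the current documentation. build_request is a hypothetical helper that only assembles keyword arguments; it makes no network call.

```python
def build_request(model: str, messages: list, tools: list, thinking_budget: int = 4000) -> dict:
    """Assemble keyword arguments for client.messages.create with thinking enabled."""
    return {
        "model": model,
        # Leave room for the visible answer on top of the thinking budget.
        "max_tokens": thinking_budget + 2000,
        "thinking": {"type": "enabled", "budget_tokens": thinking_budget},
        # Assumption: this beta header opts in to thinking between tool calls.
        "extra_headers": {"anthropic-beta": "interleaved-thinking-2025-05-14"},
        "tools": tools,
        "messages": messages,
    }

request = build_request("claude-sonnet-4-6", [{"role": "user", "content": "hi"}], tools=[])
print(request["thinking"])
```

The result would be passed as client.messages.create(**request). When thinking is combined with tool use, the assistant turn should be sent back with its thinking blocks intact, so append the whole response.content to the message history rather than filtering it down to text.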

Check your understanding

1. Which task is the BEST fit for an agent rather than a workflow?

An investigation: it has no predictable step sequence, because each finding determines the next move. Tasks with known steps belong in cheap, predictable workflows.

2. Why give an agent an explicit tool-call budget?

Resource heuristics ("simple questions: under 5 calls") teach the agent to match effort to difficulty instead of looping until the context fills.

3. What is "interleaved thinking" useful for?

Thinking between tool calls lets the agent evaluate evidence quality and change course. That is the difference between research and stubborn keyword-mashing.

3. Evaluating something that takes 200 steps

You cannot improve an agent you can’t measure, and agents resist naive measurement: two correct runs may take completely different paths. The talk’s guidance:

Start tiny and real. Twenty tasks drawn from actual usage beat five hundred synthetic ones. In the early phase, even a handful of cases with careful manual review reveals most issues.
Grade outcomes, not paths. For questions with verifiable answers, use answer-based grading — did it land on the right final answer? Let the path vary.
Use an LLM judge with a rubric for fuzzy qualities (did it cite sources? was the analysis grounded?). Judges scale your review; spot-check them against your own judgment.
Watch the transcripts. Aggregate scores tell you that something is wrong; reading the agent’s actual tool-call sequences tells you what. Most fixes turn out to be one new heuristic or one clarified tool description.
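
Answer-based grading can be sketched in a few lines. The three runs below are made-up data for illustration, and substring matching after normalization is a deliberately crude choice that can produce false positives; it is a starting point, not a recommended grader.

```python
import re

def normalize(text: str) -> str:
    # Lowercase and drop punctuation so "Paris." and "paris" compare equal.
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

def grade_outcome(final_answer: str, expected: str) -> bool:
    """Pass if the expected answer appears in the final answer; the path is ignored."""
    return normalize(expected) in normalize(final_answer)

# Made-up runs: different tool-call paths, judged only by where they landed.
runs = [
    {"tool_calls": ["search", "read"], "answer": "The capital is Paris."},
    {"tool_calls": ["search", "search", "read", "read"], "answer": "paris"},
    {"tool_calls": ["search"], "answer": "Probably Lyon."},
]
scores = [grade_outcome(r["answer"], "Paris") for r in runs]
print(scores)
```

The first two runs took different paths and both pass. The tool_calls field is kept alongside each answer so a failing run can be read afterwards, even though it plays no part in the score.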

Build it yourself

Follow these exact steps to reproduce it yourself · estimated time: ~45 minutes

Prerequisites

Python 3.10+
An Anthropic API key
Completion of the Prompting 101 lesson (recommended)

You’ll build a minimal but real agent: a codebase analyst that answers questions about any local project by listing and reading files in a loop. It covers the core concepts from this lesson (tools, guardrails, heuristics, a budget and the loop) in about 80 lines.

Step 1 — Set up

mkdir mini-agent && cd mini-agent
python3 -m venv .venv && source .venv/bin/activate
pip install anthropic
export ANTHROPIC_API_KEY="sk-ant-..."

Step 2 — Define tools and guardrails

Create agent.py. Note: both tools are read-only — the guardrail is structural.

import sys
from pathlib import Path

import anthropic

ROOT = Path(sys.argv[1] if len(sys.argv) > 1 else ".").resolve()

TOOLS = [
    {
        "name": "list_files",
        "description": "List files under a relative directory of the project. "
        "Use this FIRST to orient yourself before reading anything.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string", "description": "Relative dir, '' for root"}},
            "required": ["path"],
        },
    },
    {
        "name": "read_file",
        "description": "Read one file's content (truncated to 8000 chars). "
        "Prefer reading few, well-chosen files over reading everything.",
        "input_schema": {
            "type": "object",
            "properties": {"path": {"type": "string"}},
            "required": ["path"],
        },
    },
]

def run_tool(name: str, args: dict) -> str:
    """Execute one read-only tool call and return its result as text for the model."""
    rel = args.get("path", "")                   # tolerate a missing argument
    target = (ROOT / rel).resolve()
    if not target.is_relative_to(ROOT):          # guardrail: never escape the project
        return "Error: path outside project root."
    if name == "list_files":
        if not target.is_dir():
            return f"Error: {rel!r} is not a directory."
        entries = [
            p.name + ("/" if p.is_dir() else "")
            for p in sorted(target.iterdir())
            if p.name not in {".git", "node_modules", ".venv", "dist"}
        ]
        return "\n".join(entries) or "(empty)"
    if name == "read_file":
        if not target.is_file():
            return f"Error: {rel!r} is not a file."
        try:
            return target.read_text(errors="replace")[:8000]
        except OSError as exc:                   # e.g. permission denied
            return f"Error: could not read {rel!r}: {exc}"
    return f"Error: unknown tool {name}"
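
To see why the guardrail resolves the path before checking containment, here is a small standalone check against a temporary directory. inside_root is a hypothetical helper that mirrors the test in run_tool; it is not part of agent.py.

```python
import tempfile
from pathlib import Path

def inside_root(root: Path, relative: str) -> bool:
    """Resolve first, then test containment, as run_tool does."""
    return (root / relative).resolve().is_relative_to(root)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp).resolve()
    (root / "src").mkdir()
    checks = {
        "src": inside_root(root, "src"),
        "": inside_root(root, ""),
        "../": inside_root(root, "../"),
        "src/../../etc/passwd": inside_root(root, "src/../../etc/passwd"),
        "/etc/passwd": inside_root(root, "/etc/passwd"),
    }

for path, allowed in checks.items():
    print(f"{path!r:28} allowed={allowed}")
```

Because resolve() collapses ".." segments and follows symlinks before the comparison, both traversal paths and absolute paths are rejected, while the root itself and its subdirectories pass.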

Step 3 — Write the agent prompt (heuristics + budget + an out)

Append the prompt to agent.py, below run_tool.

SYSTEM = """You are a codebase analyst agent. Answer the user's question about
the project by exploring it with your tools.

Heuristics:
- Orient first: list the root, then drill into promising directories.
- Read selectively. README, config and entry-point files usually answer
  structural questions fastest.
- Budget: simple questions should need under 8 tool calls; never exceed 15.
- If you cannot find the answer, say exactly what you looked at and what is
  missing. Never invent file contents.

When you have enough evidence, stop exploring and answer concisely, citing
file paths."""

Step 4 — The loop itself

Append the loop to agent.py as well, below the prompt.

client = anthropic.Anthropic()

def agent(question: str) -> str:
    messages = [{"role": "user", "content": question}]
    for _ in range(15):  # hard ceiling on model turns (a turn may hold several tool calls), backing up the soft budget
        response = client.messages.create(
            model="claude-sonnet-4-6",
            max_tokens=2000,
            system=SYSTEM,
            tools=TOOLS,
            messages=messages,
        )
        if response.stop_reason != "tool_use":
            return "".join(b.text for b in response.content if b.type == "text")
        messages.append({"role": "assistant", "content": response.content})
        results = [
            {
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": run_tool(block.name, block.input),
            }
            for block in response.content
            if block.type == "tool_use"
        ]
        for block in response.content:           # watch the transcript while you build
            if block.type == "tool_use":
                print(f"  ⚙ {block.name}({block.input})", file=sys.stderr)
        messages.append({"role": "user", "content": results})
    return "Stopped: tool-call budget exhausted."

if __name__ == "__main__":
    print(agent(input("Question about this codebase: ")))

Step 5 — Run it and watch it think

python3 agent.py /path/to/any/project
# Question about this codebase: How is this project built and deployed?

Expected result: stderr shows the loop in action — list_files(''), then targeted reads of README/config files, then a cited answer. Note how it orients first: that behavior came from one heuristic line in the system prompt. Delete that line and run again to watch the quality drop — you just did your first agent-prompt ablation.
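
The ablation can be done by hand, but a small helper makes it repeatable. ablate and DEMO_PROMPT below are hypothetical names for illustration; the demo prompt copies two heuristic lines from Step 3. Note the helper removes single lines only, so a heuristic that wraps onto a continuation line would need its continuation removed too.

```python
def ablate(prompt: str, marker: str) -> str:
    """Return the prompt with every line containing `marker` removed."""
    kept = [line for line in prompt.splitlines() if marker not in line]
    return "\n".join(kept)

DEMO_PROMPT = """Heuristics:
- Orient first: list the root, then drill into promising directories.
- Budget: simple questions should need under 8 tool calls; never exceed 15."""

ablated = ablate(DEMO_PROMPT, "Orient first")
print(ablated)
```

Running the same question with the full prompt and the ablated one, then comparing the stderr transcripts, shows what that single line was contributing.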

Step 6 — Evaluate like the talk says

Write 5 questions about a repo you know well, with expected answers. Run them, grade answer-correctness (right/wrong), and read the worst transcript. Fix it by adding one heuristic, not by scripting steps. That loop — eval, read transcript, refine heuristic — is agent engineering.
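
A minimal runner for this step might look like the sketch below. run_eval, the two sample cases, and fake_agent are all hypothetical; fake_agent stands in for the real agent function so the example runs without an API key. Correctness here is a case-insensitive substring check, which is crude and worth replacing once it starts misjudging answers.

```python
from typing import Callable

def run_eval(agent: Callable[[str], str], cases: list[tuple[str, str]]) -> list[dict]:
    """Run each question through the agent and mark the answer right or wrong."""
    results = []
    for question, expected in cases:
        answer = agent(question)
        results.append({
            "question": question,
            "answer": answer,
            "correct": expected.lower() in answer.lower(),
        })
    return results

def fake_agent(question: str) -> str:
    # Stand-in for the real agent; returns canned answers.
    canned = {"What language is this project written in?": "Mostly Python (see pyproject.toml)."}
    return canned.get(question, "I could not find that.")

cases = [
    ("What language is this project written in?", "python"),
    ("Which CI system is used?", "github actions"),
]
results = run_eval(fake_agent, cases)
failures = [r["question"] for r in results if not r["correct"]]
print(f"{len(results) - len(failures)}/{len(results)} correct; reread transcripts for: {failures}")
```

Swapping fake_agent for the agent function from Step 4 turns this into the real evaluation; the failures list tells you which transcripts to read first.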

Where to go next

Watch the original talk for the live walkthroughs of real agent prompts.
Anthropic’s essay Building Effective Agents is the canonical written companion.
See these ideas productized in Mastering Claude Code — Claude Code is exactly this loop, polished.

Related lessons

Building Effective Agents: Workflows, Agents and the Patterns Between

MCP 201: How the Model Context Protocol Really Works

Prompting 101: The Anatomy of a Production-Grade Prompt
