Routines, CI Autofix, and the Advisor Strategy

1. Routines: Claude Code runs when things happen

Until now, Claude Code has been reactive — you open a terminal, you type, Claude responds. The London 2026 keynote introduced routines, and they flip that model.

A routine is a higher-order prompt that fires on:

A schedule — cron-style, e.g. every night at midnight
A webhook — e.g. a CI system calls an endpoint when a test fails
An API call — e.g. your own tooling triggers an agent programmatically

The result is that Claude Code goes from a tool you reach for to a system that reacts to events in your engineering environment.

Three illustrative routines from the keynote

Trigger	Prompt
Nightly cron	“Review all open PRs and flag ones stuck for > 2 days”
CI webhook on failure	“A test just failed: read the output and try to fix it”
Monday morning schedule	“Generate a weekly engineering summary from merged PRs”

None of these require a human to press Enter. The human receives a result — a Slack message, a commit, a drafted PR comment — and decides what to do with it.

Routines trigger model: three entry points (schedule, webhook, API) feed into a routine prompt, which launches an agent that produces an observable result.
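The trigger model can be sketched in a few lines of Python. This is a minimal illustration, not the actual routines configuration format: the `Routine` class, the `ROUTINES` registry, the routine names, and `build_prompt` are all hypothetical, and only the three trigger types and the example prompts come from the keynote.

```python
from dataclasses import dataclass

TRIGGERS = ("schedule", "webhook", "api")


@dataclass(frozen=True)
class Routine:
    """A stored prompt plus the trigger that fires it (hypothetical shape)."""
    trigger: str
    prompt: str


# Hypothetical registry: the names are made up, the prompts follow the keynote examples.
ROUTINES = {
    "stale-prs": Routine("schedule", "Review all open PRs and flag ones stuck for > 2 days"),
    "ci-fix": Routine("webhook", "A test just failed: read the output and try to fix it"),
    "weekly-summary": Routine("schedule", "Generate a weekly engineering summary from merged PRs"),
}


def build_prompt(name: str, trigger: str, payload: str = "") -> str:
    """Return the prompt an agent would be launched with for one incoming event."""
    if trigger not in TRIGGERS:
        raise ValueError(f"unknown trigger type: {trigger}")
    routine = ROUTINES[name]
    if routine.trigger != trigger:
        raise ValueError(f"routine {name!r} is not fired by {trigger!r}")
    # Event data (for example CI output) is appended so the agent can act on it.
    return f"{routine.prompt}\n\n{payload}".strip()
```

Calling `build_prompt("ci-fix", "webhook", "FAIL test_login")` yields the stored prompt followed by the failure text, which is the piece a human previously had to paste in by hand.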

2. CI Autofix: letting Claude babysit your pull requests

The second major announcement directly addresses one of the most frustrating parts of shipping software: the CI red-green-red-green loop.

CI Autofix connects Claude Code to your pull request pipeline. Here is the sequence:

You push a branch and open a PR
CI runs — a test fails
Claude Code reads the failure output
It attempts a fix, commits, and pushes
CI runs again — if it fails again, Claude loops: read → fix → push
When CI is green, the PR sits ready for human review

Merge conflicts get the same treatment. Claude reads the conflicting diff, resolves it, and pushes the resolved commit.
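The read → fix → push loop above can be sketched as a small control loop. This is an illustration under assumptions, not how CI Autofix is implemented: `run_ci` and `attempt_fix` are hypothetical callables you would supply, and the `max_attempts` cap is an assumption added here so the loop always terminates.

```python
from typing import Callable


def autofix_loop(
    run_ci: Callable[[], tuple[bool, str]],
    attempt_fix: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Repeat read -> fix -> push until CI is green or attempts run out.

    run_ci returns (passed, log); attempt_fix receives the failure log and is
    expected to commit and push a candidate fix. Both are hypothetical hooks.
    """
    for _ in range(max_attempts):
        passed, log = run_ci()
        if passed:
            return True  # green: the PR is ready for human review
        attempt_fix(log)
    # One final CI run to check the last pushed fix.
    passed, _ = run_ci()
    return passed
```

Returning `False` is the point where a human takes over, which keeps a stubborn failure from turning into an unbounded stream of commits.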

What changes about the developer’s job

Before CI Autofix, a developer’s workflow looked like:

push → CI fails → read log → fix → push → CI fails again → …

After CI Autofix, it looks like:

push → review green PR

The human is no longer a fixer of build breaks. They become a reviewer of completed work. That is a meaningful level-up in leverage, especially on a team where CI flakiness has historically been a constant drain.

3. Agent View in the CLI

A smaller but practically important feature: Agent View.

When you run multiple agents in parallel (say, five separate Claude Code sessions working on different tasks), Agent View gives you a single pane in the CLI that shows:

All running agents and their current status
What each agent is doing at this moment
A way to switch focus to any agent interactively

Previously, “run many agents in parallel” was advice that sounded good but was awkward in practice. You had to juggle multiple terminal windows or tmux panes, with no easy way to get an overview. Agent View makes the workflow feel native.

4. The Advisor Strategy: frontier quality at lower cost

This is the most architecturally interesting announcement in the keynote, and the one with the clearest ROI story.

The core idea

Most steps in an agentic loop are routine:

Read a file
Run a test command
Check whether output matches a pattern
Write a small edit

A handful of steps require genuine deep reasoning:

Deciding how to approach an ambiguous spec
Diagnosing a subtle cross-cutting bug
Designing a schema that will be hard to change later

The advisor strategy assigns models to steps by what those steps actually require.

Executor model (small, fast, cheap): runs the main loop — routine file reads, shell commands, incremental edits.

Advisor model (large, slow, expensive): consulted on hard decisions only.

Advisor strategy architecture: a small executor runs the agentic loop and calls out to a large advisor only when it encounters a step requiring deep reasoning. Most steps never reach the advisor.
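A minimal sketch of that routing follows. It assumes each step is already labeled as needing deep reasoning, which is a simplification: in the architecture described above, the executor itself decides when to call out. `Step`, `run_step`, and the callable model stand-ins are hypothetical names, not part of any SDK.

```python
from dataclasses import dataclass
from typing import Callable

# A "model" here is just a function from prompt to response (stand-in for an API call).
Model = Callable[[str], str]


@dataclass
class Step:
    description: str
    needs_deep_reasoning: bool = False


def run_step(step: Step, executor: Model, advisor: Model) -> str:
    """Run one step on the executor, consulting the advisor only for hard steps."""
    if step.needs_deep_reasoning:
        # The advisor gives guidance; the executor still carries out the work.
        guidance = advisor(step.description)
        return executor(f"{step.description}\nAdvisor guidance: {guidance}")
    return executor(step.description)
```

In a loop of routine steps with one hard decision, the advisor is called once and the executor handles everything else, which is where the cost difference comes from.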

Real-world result: Eve Legal

The keynote cited Eve Legal as a production case study:

Before: Opus for every step → frontier quality, high cost
After: Sonnet executor + Opus advisor → frontier quality, 5× lower cost per task

The 5× number is significant. It means you can run five times as many agent tasks for the same budget, or equivalently, make agentic automation economically viable for tasks that were borderline before.
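To see how a saving of that size can arise, here is a back-of-the-envelope model. The numbers are purely illustrative and are not real model prices or Eve Legal's figures: `price_ratio` and `advisor_share` are hypothetical inputs, and the model ignores the extra tokens spent handing context to the advisor.

```python
def savings_factor(price_ratio: float, advisor_share: float) -> float:
    """How many times cheaper the mixed setup is than running everything on the advisor.

    price_ratio   -- advisor price divided by executor price, per unit of work
    advisor_share -- fraction of the work (0 to 1) that still goes to the advisor
    """
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    if not 0 <= advisor_share <= 1:
        raise ValueError("advisor_share must be between 0 and 1")
    # Cost in executor-price units: executor handles the rest, advisor its share.
    mixed_cost = (1 - advisor_share) + advisor_share * price_ratio
    return price_ratio / mixed_cost
```

With a hypothetical 10× price gap and 10% of the work routed to the advisor, `savings_factor(10, 0.1)` comes out a little above 5. The saving shrinks quickly as the advisor's share grows, so the pattern pays off only when hard steps are genuinely rare.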

When to reach for this pattern

Tasks with a mix of clearly routine steps and a few genuinely ambiguous decisions
High-volume agentic workloads where cost is a real constraint
Teams that want frontier-quality reasoning on critical decisions without paying for it on every ls the agent runs

5. Self-Hosted Sandboxes

Claude Managed Agents now lets you bring your own compute environment. Instead of running agent code on Anthropic’s infrastructure, you point Claude at a sandbox provider you control.

Supported out of the box:

Daytona
Cloudflare
Vercel
Modal

How it works

The agent’s reasoning still happens on Anthropic’s infrastructure. What changes is where shell commands and file operations execute. When you configure a self-hosted sandbox provider, the Managed Agents runtime routes bash and file I/O tool calls to your chosen environment instead of a shared Anthropic-managed VM.

The configuration lives in your agent definition:

{
  "sandbox": {
    "provider": "modal",
    "image": "my-org/agent-sandbox:latest",
    "env": ["DATABASE_URL", "INTERNAL_API_KEY"]
  }
}

Your image can pre-install your tools, contain your private certificates, and mount persistent volumes — things that aren’t possible in a shared sandbox.
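The routing rule described above can be sketched as a lookup. This is an assumption-laden illustration rather than the Managed Agents runtime: the tool names in `SANDBOXED_TOOLS`, the `execution_target` function, and the `"anthropic-managed"` label are hypothetical. Only the `sandbox.provider` field comes from the configuration shown.

```python
import json

# Hypothetical tool names for shell and file I/O calls.
SANDBOXED_TOOLS = {"bash", "read_file", "write_file"}


def execution_target(tool_name: str, agent_definition: dict) -> str:
    """Return where a tool call would execute under the given agent definition."""
    sandbox = agent_definition.get("sandbox")
    if sandbox and tool_name in SANDBOXED_TOOLS:
        # Shell and file operations go to the provider you control.
        return sandbox["provider"]
    # Without a sandbox block, or for other tools, nothing is rerouted.
    return "anthropic-managed"


definition = json.loads("""
{
  "sandbox": {
    "provider": "modal",
    "image": "my-org/agent-sandbox:latest",
    "env": ["DATABASE_URL", "INTERNAL_API_KEY"]
  }
}
""")
```

Here `execution_target("bash", definition)` resolves to the self-hosted provider, while the same call against an empty definition falls back to the managed environment.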

Why this matters

Security and compliance: Some organizations cannot allow code execution to happen outside their own infrastructure. With self-hosted sandboxes, the agent still reasons in the cloud, but every shell command, file write, and test run happens inside your environment.

Sensitive data: If your agent needs to touch a production database, internal credentials, or regulated data, you likely cannot route that through a third-party compute layer. Self-hosted sandboxes let you keep data local.

Cost control: Cloud compute costs for long-running agentic tasks can add up. Using providers like Modal gives you direct visibility and control over the bill.

6. MCP Tunnels: reaching inside your private network

The final major platform feature addresses a hard limitation of cloud-based agents: they cannot reach anything behind a firewall.

MCP tunnels solve this by creating a secure tunnel between a cloud agent and an MCP server running inside your private network. The agent lives in the cloud; the MCP server lives on your VPN, your Kubernetes cluster, or your laptop.

How the tunnel works

You run a standard MCP server locally (any language, any transport). You then start the tunnel client, which establishes an outbound-only connection to Anthropic’s tunnel relay — no inbound firewall rules required:

# On your internal machine / in your k8s pod
npx @anthropic/mcp-tunnel start \
  --server "python -m my_internal_mcp_server" \
  --name "internal-data-warehouse"

The relay assigns a tunnel ID. You wire that ID into your agent’s MCP configuration:

{
  "mcpServers": [
    { "tunnel": "internal-data-warehouse" }
  ]
}

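The cloud agent calls MCP tools on this server as if it were local. Tool requests and results travel through the tunnel, and the relay only forwards encrypted RPC frames. The systems behind the MCP server, such as the warehouse itself, stay inside your network boundary and are never exposed directly.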
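The key property is the direction of the connection: the internal side registers itself with the relay, and the agent only ever talks to the relay. The in-process simulation below illustrates that shape under heavy simplification. `Relay`, `internal_mcp_server`, and the `row_count` tool are hypothetical, and there is no real networking or encryption here.

```python
from typing import Callable

Handler = Callable[[dict], dict]


class Relay:
    """Stand-in for the tunnel relay: maps tunnel names to registered handlers."""

    def __init__(self) -> None:
        self._tunnels: dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        # Called from inside the private network (the outbound connection).
        self._tunnels[name] = handler

    def forward(self, name: str, request: dict) -> dict:
        # Called by the cloud agent; it never addresses the internal host itself.
        if name not in self._tunnels:
            raise LookupError(f"no tunnel named {name!r} is connected")
        return self._tunnels[name](request)


def internal_mcp_server(request: dict) -> dict:
    """Hypothetical internal tool server exposing a single tool."""
    if request.get("tool") == "row_count":
        return {"result": 42}
    return {"error": "unknown tool"}


relay = Relay()
relay.register("internal-data-warehouse", internal_mcp_server)
```

Because the agent can only call `relay.forward`, a tunnel that was never started from the inside is simply unreachable, which mirrors the "no inbound firewall rules" property.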

What this unlocks

Agents that query internal databases (not exposed to the internet)
Agents that call internal APIs (behind authentication, behind a VPN)
Agents that interact with tools in locked-down environments that permit only outbound connections

The Counter Growth Agent demo

The keynote showed a “Counter Growth Agent” that used MCP tunnels to connect to:

Slack — for communication and approval flows
A private data warehouse — for reading experiment results
A feature flag system — for making changes

The workflow: Claude proposes a feature flag change in Slack. A human approves it. Claude applies the change, all through MCP tools and without exposing the data warehouse or the flag system to the public internet.

This is the pattern that makes the “human as reviewer” model practical at enterprise scale: the agent has enough reach to do real work, but every significant action still gets a human sign-off.
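The propose → approve → apply flow can be sketched as an approval gate. This is a hypothetical illustration, not the demo's code: `FlagChange`, `FlagSystem`, and the flag name are made up, and a real system would verify the approver's identity rather than trust a string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class FlagChange:
    """A proposed feature flag change awaiting human sign-off."""
    flag: str
    enabled: bool
    approved_by: Optional[str] = None


class FlagSystem:
    """Hypothetical feature flag store that refuses unapproved changes."""

    def __init__(self) -> None:
        self.flags: dict[str, bool] = {}

    def apply(self, change: FlagChange) -> None:
        if change.approved_by is None:
            raise PermissionError("change has not been approved by a human")
        self.flags[change.flag] = change.enabled


# The agent proposes; nothing changes until a human records approval.
proposal = FlagChange(flag="new-checkout", enabled=True)
```

Putting the check inside `apply` rather than in the agent's prompt means the sign-off is enforced by the tool itself, even if the agent misbehaves.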

Check your understanding


1. What are the three trigger types that can fire a routine?

Routines fire on a schedule (cron-style), a webhook (e.g. from CI), or a direct API call from your tooling. All three remove the requirement for a human to manually invoke Claude Code.

2. In the advisor strategy, which model runs the main agentic loop?

The small, fast executor model runs the loop continuously. The large advisor model is consulted only for genuinely hard decisions, keeping it infrequent and therefore cost-effective.

3. Eve Legal reported a 5× cost reduction using the advisor strategy. What did they pair?

Eve Legal used Sonnet as the executor (cheap, fast) and Opus as the advisor (called only for hard decisions). This pairing delivered frontier-quality output at 5× lower cost per task.

4. What problem do MCP tunnels solve?

Cloud agents cannot reach services behind a firewall. MCP tunnels create a secure bridge so the cloud-resident agent can call MCP servers that live inside a private VPN, Kubernetes cluster, or local network.

5. What best describes the new human role in a CI Autofix workflow?

CI Autofix handles the read-fail-fix-push loop autonomously. The developer's job shifts from fixing broken builds to reviewing finished pull requests, a meaningful increase in leverage.

Build it yourself

Follow these exact steps to reproduce it yourself · estimated time: ~30 minutes

Prerequisites

Claude Code installed (npm install -g @anthropic-ai/claude-code)
A GitHub repository with a CI workflow (GitHub Actions)
Anthropic API key or Claude Pro/Max subscription
The gh CLI installed and authenticated

Step 1 — Verify your CI workflow has machine-readable output

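CI Autofix works best when test failure output is clean and actionable. Open your workflow file and confirm your test step prints failure details to the job log:

- name: Run tests
  run: npm test 2>&1

GitHub Actions records both stdout and stderr in the job log, so the 2>&1 redirect is not strictly required there. It merges stderr (where many test frameworks print errors) into stdout, which keeps the failure details together if you later pipe or save the output for Claude to read.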

Step 2 — Add a CLAUDE.md with test instructions

Claude needs to know how to run your tests locally to verify a fix before pushing. Create or update CLAUDE.md:

## CI / Testing
- Run tests: `npm test`
- Lint: `npm run lint`
- Tests must pass before any commit to main or a PR branch
- Do not modify test files when fixing a CI failure — fix the source

The last rule is important. Without it, Claude may delete a failing assertion instead of fixing the underlying code.
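A prompt rule is only a request, so it can help to back it with a mechanical check. The sketch below flags changed paths that look like test files. The naming conventions in `TEST_MARKERS` and `TEST_DIRS` are assumptions about a typical npm project, and both function names are hypothetical.

```python
from pathlib import PurePosixPath

# Assumed conventions; adjust to match your repository.
TEST_MARKERS = (".test.", ".spec.")
TEST_DIRS = {"test", "tests", "__tests__"}


def is_test_file(path: str) -> bool:
    """Guess whether a repository-relative path is a test file."""
    p = PurePosixPath(path)
    in_test_dir = bool(TEST_DIRS & set(p.parts[:-1]))
    has_marker = any(marker in p.name for marker in TEST_MARKERS)
    return in_test_dir or has_marker


def modified_test_files(changed_paths: list[str]) -> list[str]:
    """Return the changed paths that an autofix commit should not have touched."""
    return [path for path in changed_paths if is_test_file(path)]
```

One way to use it is to feed in the output of `git diff --name-only` for the autofix commit and reject the commit if the returned list is non-empty.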

Step 3 — Create the CI autofix routine

Create a file at .claude/routines/ci-autofix.md:

# CI Autofix Routine

You are responding to a CI failure on a pull request in this repository.

1. Read the failure output provided in the webhook payload
2. Identify the root cause — look at the test name, the error message, and the
   relevant source files
3. Propose a fix. Before applying it, explain your reasoning in one paragraph
4. Apply the fix, run `npm test` locally to verify, then commit with message:
   `fix: ci autofix — <one-line description>`
5. Push to the PR branch
6. If tests still fail after your fix, try once more with a different approach
7. If you cannot fix it after two attempts, post a PR comment explaining what
   you tried and what you believe the root cause is

Step 4 — Wire the routine to your CI webhook

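Add a separate GitHub Actions workflow that calls the routine when your CI workflow fails:

name: CI Autofix
on:
  workflow_run:
    workflows: ["CI"]   # must match the name of your CI workflow
    types: [completed]

permissions:
  contents: write   # push the fix commit
  actions: read     # download the failed run's log

jobs:
  autofix:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    steps:
      # workflow_run checks out the default branch unless given a ref.
      # Pushes made with the default GITHUB_TOKEN do not trigger new workflow
      # runs; pass a PAT or GitHub App token here if CI must rerun on the fix.
      - uses: actions/checkout@v4
        with:
          ref: ${{ github.event.workflow_run.head_branch }}
      - name: Install dependencies and Claude Code
        run: npm ci && npm install -g @anthropic-ai/claude-code
      - name: Trigger Claude autofix
        run: |
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
          # Pipe the failed steps' log into Claude alongside the routine prompt
          gh run view "$RUN_ID" --log-failed \
            | claude -p "$(cat .claude/routines/ci-autofix.md)" \
                --allowedTools "Bash(npm test),Bash(npm run lint),Edit,Bash(git commit:*),Bash(git push:*)"
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ github.token }}
          RUN_ID: ${{ github.event.workflow_run.id }}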
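Before relying on the workflow, it can be useful to dry-run the same invocation locally. The helper below only assembles the argument list for `claude -p` with `--allowedTools`, the two flags used in the workflow. `build_autofix_command` is a hypothetical name, and actually running the command is left to you.

```python
from pathlib import Path


def build_autofix_command(routine_path: str, allowed_tools: list[str]) -> list[str]:
    """Assemble the argv for a non-interactive Claude Code run of the routine."""
    prompt = Path(routine_path).read_text(encoding="utf-8")
    # Passing a list (not a shell string) avoids quoting problems in the prompt.
    return ["claude", "-p", prompt, "--allowedTools", ",".join(allowed_tools)]
```

Passing the result to `subprocess.run(cmd, check=True)` from the repository root would start the routine. Printing it first lets you confirm exactly which tools the agent is allowed to use.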

Step 5 — Test the routine with a known-bad commit

Create a branch, intentionally break a simple test, and push:

git checkout -b test/ci-autofix-demo
# Edit a source file to break one test
git commit -am "chore: intentionally break test for autofix demo"
git push -u origin test/ci-autofix-demo
gh pr create --title "CI Autofix demo" --body "Testing the autofix routine"

Watch the Actions tab. After the CI failure job completes, the autofix workflow should trigger, and within a few minutes you should see a new commit from the routine that restores green CI.

Step 6 — Review and refine the CLAUDE.md rules

After the first real autofix, open the commit and review what Claude changed. If it took an approach you wouldn’t have chosen, add a rule to CLAUDE.md:

## Autofix constraints
- Prefer modifying the implementation over restructuring the test
- Do not change function signatures without noting it in the PR comment
- If the fix touches more than 3 files, post a comment and wait for human review
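The three-file rule is easy to enforce outside the prompt as well. A minimal sketch, assuming you already have the list of paths changed by the fix; `needs_human_review` is a hypothetical helper and the default limit simply mirrors the rule above.

```python
def needs_human_review(changed_files: list[str], max_files: int = 3) -> bool:
    """Return True when a fix touches more files than the autofix rule allows."""
    # Deduplicate in case the same path is reported more than once.
    return len(set(changed_files)) > max_files
```

A CI step could call this on the autofix commit's file list and block the push when it returns `True`, so the constraint holds even if the model overlooks it.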

Expected result: CI failures on your PRs are resolved autonomously for the common case (a logic error, a missing edge case, a wrong constant). Edge cases that require architectural judgment will get a clear comment explaining what Claude tried, giving you a head start on the fix.

Where to go next

Watch the full keynote — the live Counter Growth Agent demo is the clearest illustration of MCP tunnels in action.
Read the Claude Code docs on routines for the current configuration format.
If the advisor strategy interests you, see the model pricing page for the executor/advisor cost math on your specific task volume.
Continue with Mastering Claude Code if you want to understand the underlying agentic loop that all of these features build on top of.
