intermediate ⏱️ 15 min read · 🎬 ~46 min video

Routines, CI Autofix, and the Advisor Strategy

The biggest Claude Code platform updates from London 2026: routines that trigger on schedules and webhooks, CI that fixes its own failures, the advisor pattern for frontier-quality at lower cost, and self-hosted agent sandboxes.

This lesson is original educational writing based on this video by Anthropic (published May 19, 2026). All credit for the original content goes to the creators.

#claude-code #agentic-workflows #managed-agents

1. Routines: Claude Code runs when things happen

Until now, Claude Code has been interactive: you open a terminal, you type, Claude responds. The London 2026 keynote introduced routines, and they flip that model.

A routine is a higher-order prompt that fires on:

  • A schedule — cron-style, e.g. every night at midnight
  • A webhook — e.g. a CI system calls an endpoint when a test fails
  • An API call — e.g. your own tooling triggers an agent programmatically

The result is that Claude Code goes from a tool you reach for to a system that reacts to events in your engineering environment.

Three illustrative routines from the keynote

Trigger | Prompt
Nightly cron | “Review all open PRs and flag ones stuck for > 2 days”
CI webhook on failure | “A test just failed. Read the output and try to fix it”
Monday morning schedule | “Generate a weekly engineering summary from merged PRs”

None of these require a human to press Enter. The human receives a result — a Slack message, a commit, a drafted PR comment — and decides what to do with it.

[Diagram: Schedule (nightly / weekly), Webhook (CI · deploy · alert), API call (your tooling) → Routine (higher-order prompt) → Agent runs (reads · edits · tests) → Result (commit · PR · message)]
Routines trigger model: three entry points (schedule, webhook, API) feed into a routine prompt, which launches an agent that produces an observable result.
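The trigger model above can be sketched in a few lines of Python. This is an illustration only, not the Claude Code routines API: the `Routine` and `RoutineRegistry` names, the trigger strings, and the `fake_agent` stand-in are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical trigger kinds mirroring the three entry points in the lesson.
TRIGGER_KINDS = {"schedule", "webhook", "api"}


@dataclass
class Routine:
    name: str
    trigger: str  # one of TRIGGER_KINDS
    prompt: str   # the higher-order prompt handed to the agent


class RoutineRegistry:
    """Maps incoming events to the routines that should fire."""

    def __init__(self, run_agent: Callable[[str], str]):
        self._run_agent = run_agent  # stand-in for launching an agent
        self._routines: List[Routine] = []

    def register(self, routine: Routine) -> None:
        if routine.trigger not in TRIGGER_KINDS:
            raise ValueError(f"unknown trigger: {routine.trigger}")
        self._routines.append(routine)

    def fire(self, trigger: str, event: str = "") -> Dict[str, str]:
        """Run every routine bound to this trigger; return results by name."""
        results: Dict[str, str] = {}
        for routine in self._routines:
            if routine.trigger == trigger:
                full_prompt = f"{routine.prompt}\n\nEvent: {event}"
                results[routine.name] = self._run_agent(full_prompt)
        return results


def fake_agent(prompt: str) -> str:
    # Placeholder: a real system would start an agent session here.
    return f"ran: {prompt.splitlines()[0]}"


registry = RoutineRegistry(fake_agent)
registry.register(Routine("stale-prs", "schedule",
                          "Review all open PRs and flag ones stuck for > 2 days"))
registry.register(Routine("ci-fix", "webhook",
                          "A test just failed. Read the output and try to fix it"))

results = registry.fire("webhook", event="test_login failed")
print(results)
```

Only the routine bound to the webhook trigger runs here; the scheduled routine stays idle until something fires `"schedule"`. No human input is involved at any point, which is the property the lesson emphasizes.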

2. CI Autofix: letting Claude babysit your pull requests

The second major announcement directly addresses one of the most frustrating parts of shipping software: the CI red-green-red-green loop.

CI Autofix connects Claude Code to your pull request pipeline. Here is the sequence:

  1. You push a branch and open a PR
  2. CI runs — a test fails
  3. Claude Code reads the failure output
  4. It attempts a fix, commits, and pushes
  5. CI runs again — if it fails again, Claude loops: read → fix → push
  6. When CI is green, the PR sits ready for human review

Merge conflicts get the same treatment. Claude reads the conflicting diff, resolves it, and pushes the resolved commit.
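The read → fix → push loop in the sequence above might look like the following sketch. The `run_ci` and `attempt_fix` callables are hypothetical stand-ins for the CI system and the agent, and the `max_attempts` cap is an assumption added so the loop always terminates; the keynote summary does not state a limit.

```python
from typing import Callable, List, Optional


def autofix_loop(
    run_ci: Callable[[], Optional[str]],
    attempt_fix: Callable[[str], None],
    max_attempts: int = 3,
) -> bool:
    """Repeat read -> fix -> push until CI is green or attempts run out.

    run_ci returns None when CI passes, otherwise the failure output.
    attempt_fix receives that output and is expected to commit and push.
    """
    for _ in range(max_attempts):
        failure = run_ci()
        if failure is None:
            return True       # green: the PR is ready for human review
        attempt_fix(failure)  # read the log, edit, commit, push
    return run_ci() is None   # final check after the last fix


# Simulated CI: each "fix" removes one pending failure.
failures: List[str] = ["AssertionError in test_total", "TypeError in test_parse"]
fixes_applied: List[str] = []


def run_ci() -> Optional[str]:
    return failures[0] if failures else None


def attempt_fix(log: str) -> None:
    fixes_applied.append(log)
    failures.pop(0)


green = autofix_loop(run_ci, attempt_fix)
print(green, fixes_applied)
```

In this simulation two failures are fixed in turn and the third CI run is green. With a failure that never goes away, the function returns `False` after `max_attempts`, which is the point where a human would be asked to step in.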

What changes about the developer’s job

Before CI Autofix, a developer’s workflow looked like:

push → CI fails → read log → fix → push → CI fails again → …

After CI Autofix, it looks like:

push → Claude fixes CI until green → review the PR

The human is no longer a fixer of build breaks. They become a reviewer of completed work. That is a meaningful level-up in leverage, especially on a team where CI flakiness has historically been a constant drain.

3. Agent View in the CLI

A smaller but practically important feature: Agent View.

When you run multiple parallel agents — say, five separate Claude Code sessions working on different tasks simultaneously — the new agent view in the CLI gives you a single pane that shows:

  • All running agents and their current status
  • What each agent is doing at this moment
  • A way to switch focus to any agent interactively

Previously, “run many agents in parallel” was advice that sounded good but was awkward in practice. You had to juggle multiple terminal windows or tmux panes, with no easy way to get an overview. Agent View makes the workflow feel native.

4. The Advisor Strategy: frontier quality at lower cost

This is the most architecturally interesting announcement in the keynote, and the one with the clearest ROI story.

The core idea

Most steps in an agentic loop are routine:

  • Read a file
  • Run a test command
  • Check whether output matches a pattern
  • Write a small edit

A handful of steps require genuine deep reasoning:

  • Deciding how to approach an ambiguous spec
  • Diagnosing a subtle cross-cutting bug
  • Designing a schema that will be hard to change later

The advisor strategy assigns models to steps by what those steps actually require.

Executor model (small, fast, cheap): runs the main loop — routine file reads, shell commands, incremental edits.

Advisor model (large, slow, expensive): consulted on hard decisions only.

[Diagram: Executor model (Sonnet: cheap, fast, agentic loop) → Routine step (read · run · edit) → Hard decision? (ambiguous · architectural) → Advisor model (Opus: expensive, infrequent) → Result returned to executor (loop continues with advisor’s answer)]
Advisor strategy architecture: a small executor runs the agentic loop and calls out to a large advisor only when it encounters a step requiring deep reasoning. Most steps never reach the advisor.
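A minimal sketch of the routing idea, under stated assumptions: the `Step` and `AdvisorLoop` types, the `HARD_KINDS` labels, and the lambda "models" are invented for illustration. In a real agent the executor model itself would decide when to escalate, rather than a fixed set of step kinds.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# Hypothetical labels for steps that need deep reasoning.
HARD_KINDS = {"ambiguous_spec", "cross_cutting_bug", "schema_design"}


@dataclass
class Step:
    kind: str
    payload: str


@dataclass
class AdvisorLoop:
    executor: Callable[[str], str]  # small, fast model: runs every step
    advisor: Callable[[str], str]   # large model: consulted on hard steps only
    advisor_calls: int = 0
    log: List[str] = field(default_factory=list)

    def run(self, steps: List[Step]) -> List[str]:
        for step in steps:
            if step.kind in HARD_KINDS:
                self.advisor_calls += 1
                guidance = self.advisor(step.payload)
                # The executor continues the loop with the advisor's answer.
                output = self.executor(f"{step.payload} [advice: {guidance}]")
            else:
                output = self.executor(step.payload)
            self.log.append(output)
        return self.log


loop = AdvisorLoop(
    executor=lambda prompt: f"exec({prompt})",
    advisor=lambda prompt: "use a join table",
)
log = loop.run([
    Step("read_file", "open models.py"),
    Step("schema_design", "design tags schema"),
    Step("run_tests", "npm test"),
])
print(loop.advisor_calls, log)
```

Three steps run, but only the schema decision reaches the advisor. The executor still performs that step; it just does so with the advisor's guidance attached.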

The keynote cited Eve Legal as a production case study:

  • Before: Opus for every step → frontier quality, high cost
  • After: Sonnet executor + Opus advisor → frontier quality, 5× lower cost per task

The 5× figure is significant. It means you can run five times as many agent tasks for the same budget, which can make agentic automation economically viable for tasks that were borderline before.
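To see how a saving of that size can arise, here is a back-of-the-envelope model. The numbers are hypothetical and are not Eve Legal's figures or actual model prices. It assumes each step costs 1 unit on the executor and `price_ratio` units on the advisor, and that a fraction `advisor_fraction` of steps is handled at advisor cost.

```python
def savings_factor(price_ratio: float, advisor_fraction: float) -> float:
    """Cost of running every step on the advisor, divided by the mixed cost.

    price_ratio: advisor cost per step relative to the executor (executor = 1).
    advisor_fraction: share of steps that are billed at the advisor's price.
    """
    if price_ratio <= 0:
        raise ValueError("price_ratio must be positive")
    if not 0.0 <= advisor_fraction <= 1.0:
        raise ValueError("advisor_fraction must be between 0 and 1")
    all_advisor = price_ratio
    mixed = (1.0 - advisor_fraction) * 1.0 + advisor_fraction * price_ratio
    return all_advisor / mixed


# Hypothetical: the advisor is 10x the executor's price, 10% of steps escalate.
print(round(savings_factor(10, 0.1), 2))  # 5.26
```

Under these assumed inputs the mixed setup is a little over 5× cheaper. The saving shrinks quickly as more steps escalate, so the pattern pays off most when hard decisions are rare.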

When to reach for this pattern

  • Tasks with a mix of clearly routine steps and a few genuinely ambiguous decisions
  • High-volume agentic workloads where cost is a real constraint
  • Teams that want frontier-quality reasoning on critical decisions without paying for it on every ls the agent runs

5. Self-Hosted Sandboxes

Claude Managed Agents now lets you bring your own compute environment. Instead of running agent code on Anthropic’s infrastructure, you point Claude at a sandbox provider you control.

Supported out of the box:

  • Daytona
  • Cloudflare
  • Vercel
  • Modal

How it works

The agent’s reasoning still happens on Anthropic’s infrastructure. What changes is where shell commands and file operations execute. When you configure a self-hosted sandbox provider, the Managed Agents runtime routes bash and file I/O tool calls to your chosen environment instead of a shared Anthropic-managed VM.

The configuration lives in your agent definition:

{
  "sandbox": {
    "provider": "modal",
    "image": "my-org/agent-sandbox:latest",
    "env": ["DATABASE_URL", "INTERNAL_API_KEY"]
  }
}

Your image can pre-install your tools, contain your private certificates, and mount persistent volumes — things that aren’t possible in a shared sandbox.
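Before deploying a definition like the one above, a small pre-flight check can catch typos. The sketch below is not part of Managed Agents. It assumes the provider identifiers are the lowercase names of the four listed providers, and that `env` is a list of variable names to forward from your environment; both are guesses based on the example config.

```python
import json
from typing import List, Mapping

# Assumed identifiers, derived from the provider list in the lesson.
SUPPORTED_PROVIDERS = {"daytona", "cloudflare", "vercel", "modal"}


def check_sandbox_config(raw: str, environ: Mapping[str, str]) -> List[str]:
    """Return a list of problems; an empty list means the config looks usable."""
    problems: List[str] = []
    sandbox = json.loads(raw).get("sandbox", {})

    provider = sandbox.get("provider")
    if provider not in SUPPORTED_PROVIDERS:
        problems.append(f"unsupported provider: {provider!r}")

    if not sandbox.get("image"):
        problems.append("missing image")

    # Assumption: each entry names a variable that must exist locally.
    for name in sandbox.get("env", []):
        if name not in environ:
            problems.append(f"env var not set: {name}")

    return problems


config = """
{
  "sandbox": {
    "provider": "modal",
    "image": "my-org/agent-sandbox:latest",
    "env": ["DATABASE_URL", "INTERNAL_API_KEY"]
  }
}
"""

problems = check_sandbox_config(config, {"DATABASE_URL": "postgres://example"})
print(problems)
```

In practice you would pass `os.environ` as the second argument; a plain dictionary is used here so the example behaves the same on any machine.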

Why this matters

Security and compliance: Some organizations cannot allow code execution to happen outside their own infrastructure. With self-hosted sandboxes, the agent still reasons in the cloud, but every shell command, file write, and test run happens inside your environment.

Sensitive data: If your agent needs to touch a production database, internal credentials, or regulated data, you likely cannot route that through a third-party compute layer. Self-hosted sandboxes let you keep data local.

Cost control: Cloud compute costs for long-running agentic tasks can add up. Using providers like Modal gives you direct visibility and control over the bill.

6. MCP Tunnels: reaching inside your private network

The final major platform feature addresses a hard limitation of cloud-based agents: they cannot reach anything behind a firewall.

MCP tunnels solve this by creating a secure tunnel between a cloud agent and an MCP server running inside your private network. The agent lives in the cloud; the MCP server lives on your VPN, your Kubernetes cluster, or your laptop.

How the tunnel works

You run a standard MCP server locally (any language, any transport). You then start the tunnel client, which establishes an outbound-only connection to Anthropic’s tunnel relay — no inbound firewall rules required:

# On your internal machine / in your k8s pod
npx @anthropic/mcp-tunnel start \
  --server "python -m my_internal_mcp_server" \
  --name "internal-data-warehouse"

The relay assigns a tunnel ID, which in this example is the name passed to --name. You wire that ID into your agent’s MCP configuration:

{
  "mcpServers": [
    { "tunnel": "internal-data-warehouse" }
  ]
}

The cloud agent calls MCP tools on this server as if it were local. Requests and results travel through the tunnel as encrypted RPC frames that the relay only forwards, while the server and its underlying data stores stay inside your network boundary.
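The key property is the direction of the connections: the machine inside your network only ever dials out. The in-memory simulation below illustrates that shape. It is not the actual tunnel protocol; the `Relay` and `TunnelClient` classes and the frame layout are assumptions, and encryption is omitted.

```python
import queue
from typing import Any, Callable, Dict


class Relay:
    """Stand-in for the tunnel relay: it only queues frames in each direction."""

    def __init__(self) -> None:
        self.requests: "queue.Queue[Dict[str, Any]]" = queue.Queue()
        self.responses: "queue.Queue[Dict[str, Any]]" = queue.Queue()

    def call(self, frame: Dict[str, Any]) -> None:
        # Used by the cloud agent to submit a tool call.
        self.requests.put(frame)


class TunnelClient:
    """Runs inside the private network and initiates every exchange itself."""

    def __init__(self, relay: Relay, tools: Dict[str, Callable[..., Any]]) -> None:
        self.relay = relay
        self.tools = tools  # local tool handlers, never exposed directly

    def poll_once(self) -> bool:
        """Fetch one pending request (outbound), run it locally, send the result."""
        try:
            frame = self.relay.requests.get_nowait()
        except queue.Empty:
            return False
        handler = self.tools[frame["tool"]]
        result = handler(**frame["args"])
        self.relay.responses.put({"id": frame["id"], "result": result})
        return True


# Hypothetical internal tool: row counts from a private warehouse.
def row_count(table: str) -> int:
    return {"orders": 42}.get(table, 0)


relay = Relay()
client = TunnelClient(relay, {"row_count": row_count})

relay.call({"id": 1, "tool": "row_count", "args": {"table": "orders"}})
handled = client.poll_once()
response = relay.responses.get_nowait()
print(handled, response)
```

The relay never opens a connection into the private side. It holds frames until the client picks them up, which is why no inbound firewall rule is needed.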

What this unlocks

  • Agents that query internal databases (not exposed to the internet)
  • Agents that call internal APIs (behind authentication, behind a VPN)
  • Agents that interact with tools in locked-down environments that allow only outbound connections

The Counter Growth Agent demo

The keynote showed a “Counter Growth Agent” that used MCP tunnels to connect to:

  1. Slack — for communication and approval flows
  2. A private data warehouse — for reading experiment results
  3. A feature flag system — for making changes

The workflow: Claude proposes a feature flag change in Slack. A human approves it. Claude applies the change, all through MCP tools, and without exposing the data warehouse or the flag system outside the organization’s network.

This is the pattern that makes the “human as reviewer” model practical at enterprise scale: the agent has enough reach to do real work, but every significant action still gets a human sign-off.

Check your understanding


  1. What are the three trigger types that can fire a routine?

  2. In the advisor strategy, which model runs the main agentic loop?

  3. Eve Legal reported a 5× cost reduction using the advisor strategy. What did they pair?

  4. What problem do MCP tunnels solve?

  5. Which of these best describes the new human role in a CI Autofix workflow?

Build it yourself

Follow these exact steps to reproduce it yourself · estimated time: ~30 minutes

Prerequisites

  • Claude Code installed (npm install -g @anthropic-ai/claude-code)
  • A GitHub repository with a CI workflow (GitHub Actions)
  • Anthropic API key or Claude Pro/Max subscription
  • The gh CLI installed and authenticated

Step 1 — Verify your CI workflow has machine-readable output

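CI Autofix works best when test failure output is clean and actionable. Open your workflow file and confirm your test step prints failure details to the job log:

- name: Run tests
  run: npm test 2>&1

GitHub Actions already records both stdout and stderr in the job log, so the 2>&1 redirect is not required for logging. It matters once you pipe or redirect the output (for example into a file), because it keeps stderr, where many test frameworks print errors, in the same stream as stdout.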

Step 2 — Add a CLAUDE.md with test instructions

Claude needs to know how to run your tests locally to verify a fix before pushing. Create or update CLAUDE.md:

## CI / Testing
- Run tests: `npm test`
- Lint: `npm run lint`
- Tests must pass before any commit to main or a PR branch
- Do not modify test files when fixing a CI failure — fix the source

The last rule is important. Without it, Claude may delete a failing assertion instead of fixing the underlying code.
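A written rule is easier to trust when something checks it. As a sketch, the helper below flags test files in a list of changed paths, such as the output of `git diff --name-only`. The glob patterns are hypothetical and should be adjusted to your repository's layout.

```python
import fnmatch
from typing import Iterable, List

# Hypothetical patterns for a typical JavaScript/TypeScript repository.
TEST_PATTERNS = [
    "tests/*",
    "test/*",
    "__tests__/*",
    "*/__tests__/*",
    "*.test.js",
    "*.test.ts",
    "*.spec.js",
    "*.spec.ts",
]


def touched_test_files(changed: Iterable[str]) -> List[str]:
    """Return the changed paths that look like test files."""
    return [
        path
        for path in changed
        # fnmatchcase keeps matching case-sensitive on every platform;
        # note that "*" in fnmatch patterns also matches "/".
        if any(fnmatch.fnmatchcase(path, pattern) for pattern in TEST_PATTERNS)
    ]


changed = ["src/cart.js", "src/cart.test.js", "tests/unit/total.js"]
flagged = touched_test_files(changed)
print(flagged)
```

A CI job could fail the autofix commit, or request human review, whenever this list is non-empty.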

Step 3 — Create the CI autofix routine

Create a file at .claude/routines/ci-autofix.md:

# CI Autofix Routine

You are responding to a CI failure on a pull request in this repository.

1. Read the failure output provided in the webhook payload
2. Identify the root cause — look at the test name, the error message, and the
   relevant source files
3. Propose a fix. Before applying it, explain your reasoning in one paragraph
4. Apply the fix, run `npm test` locally to verify, then commit with message:
   `fix: ci autofix — <one-line description>`
5. Push to the PR branch
6. If tests still fail after your fix, try once more with a different approach
7. If you cannot fix it after two attempts, post a PR comment explaining what
   you tried and what you believe the root cause is

Step 4 — Wire the routine to your CI webhook

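Add a GitHub Actions workflow that runs the routine when the CI workflow fails. It checks out the failing branch, installs Claude Code, and appends the branch name and failure log to the routine prompt. Commits pushed with the default GITHUB_TOKEN do not trigger new workflow runs, so check out with a personal access token or GitHub App token if CI should re-run on the fix:

name: CI Autofix
on:
  workflow_run:
    workflows: ["CI"]
    types: [completed]
permissions:
  contents: write       # push the fix commit
  actions: read         # read the failed run's log
  pull-requests: write  # comment on the PR when the fix fails
jobs:
  autofix:
    if: ${{ github.event.workflow_run.conclusion == 'failure' }}
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          # workflow_run checks out the default branch unless given a ref
          ref: ${{ github.event.workflow_run.head_branch }}
      - name: Install dependencies and Claude Code
        run: |
          npm ci
          npm install -g @anthropic-ai/claude-code
          git config user.name "github-actions[bot]"
          git config user.email "41898282+github-actions[bot]@users.noreply.github.com"
      - name: Trigger Claude autofix
        run: |
          # Append the branch and failure log to the routine prompt
          LOG="$(gh run view "$RUN_ID" --log-failed | tail -n 200)"
          claude -p "$(cat .claude/routines/ci-autofix.md)
          Webhook payload: CI failed on branch $HEAD_BRANCH
          $LOG" \
            --allowedTools "Edit,Bash(npm test),Bash(npm run lint),Bash(git add:*),Bash(git commit:*),Bash(git push:*),Bash(gh pr comment:*)"
        env:
          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
          GH_TOKEN: ${{ github.token }}
          HEAD_BRANCH: ${{ github.event.workflow_run.head_branch }}
          RUN_ID: ${{ github.event.workflow_run.id }}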

Step 5 — Test the routine with a known-bad commit

Create a branch, intentionally break a simple test, and push:

git checkout -b test/ci-autofix-demo
# Edit a source file to break one test
git commit -am "chore: intentionally break test for autofix demo"
git push -u origin test/ci-autofix-demo
gh pr create --title "CI Autofix demo" --body "Testing the autofix routine"

Watch the Actions tab. After the CI failure job completes, the autofix workflow should trigger, and within a few minutes you should see a new commit from the routine that restores green CI.

Step 6 — Review and refine the CLAUDE.md rules

After the first real autofix, open the commit and review what Claude changed. If it took an approach you wouldn’t have chosen, add a rule to CLAUDE.md:

## Autofix constraints
- Prefer modifying the implementation over restructuring the test
- Do not change function signatures without noting it in the PR comment
- If the fix touches more than 3 files, post a comment and wait for human review

Expected result: CI failures on your PRs are resolved autonomously for the common case (a logic error, a missing edge case, a wrong constant). Edge cases that require architectural judgment will get a clear comment explaining what Claude tried, giving you a head start on the fix.

Where to go next

  • Watch the full keynote — the live Counter Growth Agent demo is the clearest illustration of MCP tunnels in action.
  • Read the Claude Code docs on routines for the current configuration format.
  • If the advisor strategy interests you, see the model pricing page for the executor/advisor cost math on your specific task volume.
  • Continue with Mastering Claude Code if you want to understand the underlying agentic loop that all of these features build on top of.
