Agent Harness Engineering: Chasing Friction
AirOps's hard-won lessons from shipping Claude agents to non-technical enterprise users: intentional scoping, specialized tools over primitive exploration, and sub-agents for context isolation.
25 interactive classes built from the official Claude YouTube channel.
Sessions are isolated by default: agents forget everything when they close. This lesson shows how to wire persistent memory onto your agents and use Dreaming to consolidate and improve what they remember over time.
Fiona Fung, Head of Engineering for Claude Code, shares five lessons from rewriting her team's norms when AI changed where the bottlenecks are, from planning and code review to hiring, onboarding, and org shape.
What Fable 5's capabilities unlock, how dynamic workflows reshape engineering at scale, and what it looks like when a company runs on an AI substrate.
Without evals you're flying blind: reactive to complaints, unable to verify improvements. This lesson shows how to build code and model graders, run QA loops, and turn subjective quality into something you can act on.
How Claude Code's context window works, when to compact vs clear, and practical strategies for keeping sessions lean and productive.
How Replit built VibeBench and the Telescope continuous improvement system to turn overnight eval runs into shipped model upgrades, without a human in the loop.
How Cursor gave cloud agents onboarding, dev environments, and the ability to self-report problems, and what the 'agent experience' means for teams shipping parallel agents at scale.
Use Claude Code's lifecycle hooks to run formatters, block dangerous operations, and enforce team conventions every time, without relying on Claude to remember.
Anthropic's Applied AI team shares three practices that get more out of longer-running agents: letting Claude interview you instead of writing specs yourself, using HTML over Markdown for richer specs, and embedding verification directly into your artifacts so agents can validate their own work.
Design production memory systems for multi-agent architectures using filesystem-based memory stores, optimistic concurrency, and the dreaming feedback loop.
The biggest Claude Code platform updates from London 2026: routines that trigger on schedules and webhooks, CI that fixes its own failures, the advisor pattern for frontier-quality at lower cost, and self-hosted agent sandboxes.
Claude Managed Agents is the fastest path from prototype to production-ready agent. This lesson walks through the three core primitives (Agent as the brain, Environment as the hands, Session as the binding) and shows how to wire them into a working incident-response agent.
Two battle-tested playbooks for prompting work: maintaining and migrating existing prompts, and building agentic loops from scratch using evals to drive every decision.
Master effort levels and adaptive thinking to get the best intelligence-speed-cost trade-off from Claude on any task.
The decision framework for knowing when agent logic belongs in a tool, a skill, or a subagent, illustrated through a live decomposition of a 400-line inventory agent.
How Elicit built AshPL, a Turing-incomplete, purely functional DSL, to make their AI research assistant legible, auditable, and faithfully executable.
Boris Cherny and Cat Wu reflect on Claude Code's first year: what changed about verification, why auto mode beat plan mode, how routines became the killer feature, and where engineering orgs are heading.
Anthropic's foundational essay distilled into a class: the five workflow patterns, what truly counts as an agent, why simplicity wins, and how to design tools your agent can actually use.
Cal Rueb's field-tested playbook for getting consistently great results from Claude Code: context curation, permission strategy, planning, parallel sessions and knowing when to course-correct.
Beyond the hello-world server: why MCP exists, its client-server architecture, the three primitives and who controls them, transports, and where the protocol is heading.
How Claude Code works under the hood, the workflows Anthropic's own engineers use, and how to extend it with memory files, slash commands and headless automation.
Anthropic's Applied AI team shows how to evolve a one-line prompt into a reliable, production-quality prompt: structure, XML tags, examples, giving the model an out, and prefills.
Agents are models using tools in a loop. This lesson covers when to build one, how to prompt it (heuristics, budgets, guardrails), and how to evaluate something that takes hundreds of steps.
Erik Schluntz merged a 22,000-line largely Claude-written change into a production RL codebase. This lesson extracts the discipline that makes that safe: you stop being the code writer and become the system designer and verifier.