Memory and Dreaming: Building Self-Improving Agents

The Isolation Problem

An agent that starts a session without memory starts from the same blank slate every time. Performance on each new task mirrors the last: there is no learning curve, only repetition. Agents independently make the same mistakes, show the same inefficiencies, and duplicate work that other agents have already done.

The goal is different: performance should improve from task to task, and from agent to agent. Memory is the mechanism that makes this possible. It lets agents carry forward learnings, avoid known pitfalls, and build on a shared understanding of the organisation they work in.

Without memory, each agent starts fresh. With memory, learnings compound across sessions and agents.

Why a File System?

Earlier memory implementations put the capability in the harness: custom tools, CLAUDE.md files, and SDK-level memory primitives. These worked, but keeping them in sync required careful engineering.

The shift in Anthropic’s managed-agent memory is simpler: model memory as a file system. Claude already excels at navigating virtual environments, using bash and grep, reading, updating, and organising files. Rather than building a bespoke memory interface, the design leans into what Claude already does well. The memory store mounts as a filesystem the agent can read and write freely.

The same “get out of Claude’s way” principle was also applied to skills: a flexible, minimal format opened up a wide range of uses precisely because the model already understood how to work with it.

Multi-Agent Memory Architecture

Single-agent memory is straightforward. Multi-agent memory introduces new requirements:

Multiple sessions reading and writing the same store simultaneously
Different scopes: org-wide knowledge vs. task-specific state
Write conflicts when two agents update the same file concurrently
Enterprise controls: version history, attribution, audit trails

The solution is a layered store hierarchy:

Scope	Access	Content
Organisation-wide	Read-only (for agents)	SLO policies, runbooks, on-call mappings — stable reference material
Task-specific	Read-write	Findings, decisions, fix status, in-flight state
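
The two scopes in the table can be illustrated with a small in-process model. This is a sketch under stated assumptions, not the platform's interface: the class name LayeredMemory, the org/ and task/ path prefixes, and the file names are all hypothetical and chosen only to show how a read-only layer and a read-write layer might sit side by side.

```python
from types import MappingProxyType


class LayeredMemory:
    """Toy two-scope memory: a read-only org layer and a read-write task layer.

    Hypothetical illustration only; names and path prefixes are assumptions.
    """

    def __init__(self, org_files: dict[str, str]):
        # Agents only ever see a read-only view of organisation-wide material.
        self.org = MappingProxyType(dict(org_files))
        # Task-specific state is an ordinary mutable mapping.
        self.task: dict[str, str] = {}

    def read(self, path: str) -> str:
        scope, _, name = path.partition("/")
        if scope == "org":
            return self.org[name]
        if scope == "task":
            return self.task[name]
        raise KeyError(path)

    def write(self, path: str, content: str) -> None:
        scope, _, name = path.partition("/")
        if scope != "task":
            # Only the task layer accepts writes from agents.
            raise PermissionError(f"{path} is read-only for agents")
        self.task[name] = content


mem = LayeredMemory({"runbook.md": "restart the service"})
mem.write("task/findings.md", "cpu spike at 09:00")
print(mem.read("org/runbook.md"))
print(mem.read("task/findings.md"))
```

Keeping the write check in one place means an agent can read both layers through the same call while stable reference material stays untouched.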

To prevent one agent from clobbering another’s writes, Anthropic uses optimistic concurrency control: an agent reads the current version, makes its update, and commits with the expected version. If a conflicting write happened in between, the commit fails and the agent retries — no locks, no blocking, good throughput under concurrent writes.
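
The read, update, commit-with-expected-version cycle described above can be sketched in a few lines. This is a minimal in-memory model, not Anthropic's implementation: VersionedFile, VersionConflict, and append_with_retry are hypothetical names, and the short internal lock only stands in for the atomic compare-and-set that a real store would perform on its side.

```python
import threading


class VersionConflict(Exception):
    """Raised when a commit is based on a stale version."""


class VersionedFile:
    def __init__(self, content: str = ""):
        self.content = content
        self.version = 0
        # Guards only the compare-and-set itself; callers never hold a lock
        # across their read-modify-write cycle.
        self._cas = threading.Lock()

    def read(self) -> tuple[str, int]:
        return self.content, self.version

    def commit(self, new_content: str, expected_version: int) -> int:
        with self._cas:
            if expected_version != self.version:
                raise VersionConflict(
                    f"expected v{expected_version}, store is at v{self.version}"
                )
            self.content = new_content
            self.version += 1
            return self.version


def append_with_retry(f: VersionedFile, line: str, max_attempts: int = 5) -> int:
    """Append a line, re-reading and retrying if another writer got in first."""
    for _ in range(max_attempts):
        content, version = f.read()
        try:
            return f.commit(content + line + "\n", version)
        except VersionConflict:
            continue  # someone else committed; re-read and try again
    raise RuntimeError("gave up after repeated conflicts")


f = VersionedFile()
_, stale = f.read()
f.commit("agent A: root cause is the cache\n", stale)  # another agent commits first
new_version = append_with_retry(f, "agent B: fix deployed")
print(new_version, f.read()[0])
```

A commit built on the stale version would fail rather than overwrite agent A's line; the retry re-reads the file, so both findings survive.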

Dreaming: The Feedback Loop

Agents writing to memory as they work is locally optimal, like taking notes while doing a task. But scaled across many sessions, locally optimal becomes globally fragmented: agents independently learn from the same mistakes, duplicate findings, and create overlapping entries.

Dreaming is Anthropic’s answer. It’s a batch process, completely decoupled from the agent loop, that:

Reads session transcripts from past runs
Inspects the current state of memory
Proposes curated, consolidated updates
Produces a verified new memory snapshot the next agents can adopt

Dreaming runs out-of-band: agents write memory in real time; dreaming refines it asynchronously.
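
To make the shape of that batch pass concrete, here is a deliberately simple sketch. It is not how dreaming works internally: the dream function, the string-matching heuristic, and the min_sessions threshold are assumptions made for illustration, and the verification step is not modelled. It only shows the flow of reading transcripts, inspecting memory, proposing consolidated entries, and returning a new snapshot without touching the live store.

```python
from collections import Counter


def normalise(note: str) -> str:
    """Lower-case, collapse whitespace, and drop a trailing full stop."""
    return " ".join(note.lower().split()).rstrip(".")


def dream(
    transcripts: dict[str, list[str]],
    memory: list[str],
    min_sessions: int = 2,
) -> list[str]:
    """Return a new memory snapshot; the inputs are not mutated."""
    # 1. Read session transcripts: count how many sessions recorded each lesson.
    seen: Counter[str] = Counter()
    for notes in transcripts.values():
        for key in {normalise(n) for n in notes}:
            seen[key] += 1

    # 2. Inspect current memory: keep the first copy of each duplicated entry.
    snapshot: list[str] = []
    known: set[str] = set()
    for entry in memory:
        key = normalise(entry)
        if key not in known:
            known.add(key)
            snapshot.append(entry)

    # 3. Propose updates: lessons that recur across sessions but are not in memory.
    for key, count in sorted(seen.items()):
        if count >= min_sessions and key not in known:
            known.add(key)
            snapshot.append(f"{key} (seen in {count} sessions)")

    # 4. The caller decides whether to adopt the returned snapshot.
    return snapshot


transcripts = {
    "s1": ["Check the deploy log first.", "Restart fixes it"],
    "s2": ["check the deploy log first"],
    "s3": ["Check  the deploy log first."],
}
memory = ["Restart fixes it", "restart fixes it."]
print(dream(transcripts, memory))
```

Because the function returns a fresh list, agents can keep writing to the live store while a pass runs, which is the point of keeping it out-of-band.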

Why out-of-band matters

Three benefits come from dreaming’s decoupled architecture:

Cross-session pattern detection. A single agent can only see its own history. Dreaming analyses transcripts across all agents and sessions, which is where recurring mistakes and systemic inefficiencies become visible. In the SRE demo, dreaming discovered that a CPU spike was always followed 60 seconds later by an alert, a pattern no individual agent session could have noticed.
No objective conflict. An agent running in production must balance improving its memory quality against completing its actual task. Dreaming runs independently, so it can focus entirely on memory quality without trading off against task performance.
Zero latency added. Dreaming is completely off the hot path. It can run nightly, hourly, ad hoc, or in response to events such as end-of-session, all via API.

Production Patterns

Scheduling. Dreaming can be triggered via API on any cadence. Common patterns: nightly consolidation, end-of-sprint review, or event-triggered after a significant incident closes.

Memory API. Memory has a standalone CRUD API, so teams can manage it from anywhere — not just from within agents. This includes exports (for audits), redactions (for compliance), and diffs between versions.
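
As an illustration of what a diff between two versions of a memory file conveys, the sketch below uses Python's standard difflib module on two exported snapshots. It does not call the Memory API; the memory_diff helper, the file name, and the v1/v2 labels are hypothetical, and the sketch assumes the two versions have already been exported as plain text.

```python
import difflib


def memory_diff(old: str, new: str, path: str = "findings.md") -> str:
    """Return a unified diff between two exported versions of one memory file."""
    return "".join(
        difflib.unified_diff(
            old.splitlines(keepends=True),
            new.splitlines(keepends=True),
            fromfile=f"{path}@v1",
            tofile=f"{path}@v2",
        )
    )


before = "cpu spike at 09:00\ncpu spike seen again\n"
after = "cpu spike at 09:00\ncpu spikes precede the alert\n"
print(memory_diff(before, after))
```

Lines prefixed with a minus sign were dropped and lines prefixed with a plus sign were added, which is usually enough to review what a consolidation pass changed.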

Enterprise controls. Every write is version-controlled with attribution: which session wrote which part. Teams can inspect how memory evolved over time and roll back bad updates — critical for production trust.
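
Attribution and rollback can be pictured as an append-only revision log. The sketch below is a toy model under stated assumptions, not the platform's data model: Revision, AuditedFile, and the "operator" session label are hypothetical. It treats a rollback as a new revision that restores earlier content, so the history itself is never rewritten.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    version: int
    session_id: str  # which session (or operator) wrote this revision
    content: str


class AuditedFile:
    def __init__(self) -> None:
        self.history: list[Revision] = []

    @property
    def current(self) -> str:
        return self.history[-1].content if self.history else ""

    def write(self, session_id: str, content: str) -> Revision:
        rev = Revision(len(self.history) + 1, session_id, content)
        self.history.append(rev)
        return rev

    def rollback(self, to_version: int, session_id: str = "operator") -> Revision:
        if not 1 <= to_version <= len(self.history):
            raise ValueError(f"no such version: {to_version}")
        # Restore old content as a new revision; earlier entries stay intact.
        return self.write(session_id, self.history[to_version - 1].content)


f = AuditedFile()
f.write("session-a", "alert follows the CPU spike")
f.write("session-b", "alert is unrelated to CPU")  # a bad update
f.rollback(1)
print(f.current)
print([(r.version, r.session_id) for r in f.history])
```

Recording the rollback as its own revision keeps the audit trail complete: the bad update and the decision to undo it both remain visible.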

Harvey’s result. With dreaming enabled on their legal benchmark, Harvey saw a 6× increase in agent completion rates. The underlying problem: agents had been independently learning from the same failures, each storing a fragmented lesson. Dreaming consolidated those lessons into shared, high-quality guidance that every subsequent agent benefited from.

Check your understanding


1. Why does Anthropic model memory as a file system rather than as a custom database?

The principle is "get out of Claude's way." Rather than building a novel interface, Anthropic chose a representation the model already handles expertly. Opus 4.7 is state-of-the-art at filesystem-based memory precisely because these are well-practised skills.

2. What is the purpose of the organisation-wide read-only memory store in a multi-agent system?

The layered store hierarchy separates stable org-wide context (read-only) from dynamic per-task state (read-write). This prevents agents from accidentally modifying shared reference material while still benefiting from it.

3. Why is dreaming's out-of-band, decoupled design important?

Three benefits: dreaming can see patterns across many agents' histories (unlike an individual agent), it has no objective conflict with task completion, and it adds zero latency to the agent's hot path.

4. What mechanism prevents two agents from overwriting each other's memory writes simultaneously?

Optimistic concurrency control avoids blocking and scales well under concurrent writes. Agents proceed without locks, and conflicts (rare in practice) trigger a retry rather than a queue.

Build it yourself

Follow these exact steps to reproduce it yourself

Try it yourself: persistent memory with the Managed Agents API

Provision a memory store via the Claude platform console or API. Start with a single read-write store.
Attach it to a session. Pass the memory store ID when creating a managed agent session. The store mounts as a virtual filesystem the agent can read and write using standard file tools.
Run two sessions sequentially on related tasks. In the second session, add to the system prompt: “Before starting, review the memory store for relevant prior findings.” Check whether session 2 references session 1’s work.
Trigger a dream. After three or more sessions, kick off dreaming via the API (or the console Dreams tab). Inspect the diff in the memory store — look for consolidation, deduplication, and new cross-session insights.
Add a read-only org store. Create a second store, populate it with a reference document (e.g. a coding style guide or an on-call runbook), and attach it read-only to your sessions. Confirm that agents reference it without modifying it.
