AI Learning
advanced ⏱️ 9 min read · 🎬 ~30 min video

MCP 201: How the Model Context Protocol Really Works

Beyond the hello-world server: why MCP exists, its client–server architecture, the three primitives and who controls them, transports, and where the protocol is heading.

This lesson is original educational writing based on this video by Anthropic (published May 22, 2025). All credit for the original content goes to the creators.

#mcp #integrations #agents
Video thumbnail: MCP 201: How the Model Context Protocol Really Works
Original video — all credit to the creators. Watch the original on YouTube ↗

1. Why a protocol?

Before MCP (announced by Anthropic in November 2024, donated as an open standard), every AI application that wanted to talk to every external system needed a custom integration: M applications × N systems integrations, each one bespoke. MCP, presented here by protocol co-creator David Soria Parra, collapses that to M + N: every app implements the protocol once as a client, every system exposes itself once as a server, and any client can use any server. It’s deliberately analogous to USB-C — one connector, arbitrary devices — or to what the Language Server Protocol did for editors and languages.

The “201” insight is why a protocol rather than a library: a protocol is language-agnostic, versioned, and lets both sides evolve independently. Your CRM team can ship an MCP server without knowing which AI app will consume it; Claude can connect to it without Anthropic writing a single CRM-specific line.

2. Architecture: host, client, server

HOST (Claude Code, Claude.ai, your app)owns the model loop & permissionsMCP client #1MCP client #2MCP client #3GitHub serverstdio (local process)Database serverstdio (local process)SaaS serverstreamable HTTP (remote)1 : 1JSON-RPC 2.0 messages · capability negotiation at initialize
MCP architecture: a host application runs one client per server connection. Servers expose tools, resources and prompts; clients can offer sampling and roots back to servers.

Terminology that’s easy to mix up:

  • The host is the application the user actually runs (Claude Code, Claude Desktop, your own product). It owns the model, the conversation and the permission UX.
  • The host creates one client per connection — a 1:1 link to one server. Three servers, three client instances.
  • A server is a (usually small) program that exposes some capability: GitHub, a database, your file system, your company’s internal API.

Under the hood it’s JSON-RPC 2.0 messages over a transport. The connection starts with an initialize handshake where both sides declare their capabilities — a server might offer tools but no resources; a client might support sampling or not. Everything after that is negotiated, which is how the protocol evolves without breaking older implementations. Servers can also push notifications (e.g. “my tool list changed”) instead of waiting to be polled.

Transports

  • stdio — the host launches the server as a local subprocess and speaks over stdin/stdout. Zero network setup; perfect for local tools and personal use.
  • Streamable HTTP — the server runs anywhere on the network and can stream via server-sent events. This is the deployment story for remote/shared servers, with OAuth for authorization. (It replaced the older HTTP+SSE transport in the 2025 spec revisions.)

3. The three primitives — and who controls them

The most useful framing in the talk: each server primitive has a different controller.

PrimitiveControlled byMental model
ToolsThe model”Things Claude may decide to do” — search tickets, run a query, post a message
ResourcesThe application”Context the host can attach” — files, schemas, documents, identified by URI
PromptsThe user”Recipes the user invokes deliberately” — slash commands with parameters

Why this matters: teams routinely cram everything into tools because tools are what the model sees. But a document your user explicitly selects shouldn’t depend on the model choosing to fetch it — that’s a resource. A standardized “summarize my inbox” action the user triggers is a prompt, not a tool the model might fire at random. Choosing the right primitive is choosing who is in control — model, app, or user — and that’s a product decision, not a technical detail.

Servers aren’t the only side with primitives. Clients can expose:

  • Sampling — a server may ask the host’s model to complete something (“summarize these 50 results before I return them”), keeping the server LLM-free and putting the host in charge of model access, cost and approval.
  • Roots — the host tells the server which directories/scopes it should operate within.

Check your understanding

4 questions · your answers are saved in this browser only

  1. 1. What problem does MCP fundamentally solve?

  2. 2. Your server exposes a list of project documents the USER should pick from and attach to the chat. Which primitive?

  3. 3. What is "sampling" in MCP?

  4. 4. Which transport should a company-wide shared MCP server use?

4. Where the protocol is heading

The talk closes with the roadmap themes that have since largely materialized: a public server registry for discovery, mature OAuth-based authorization for remote servers, elicitation (servers asking the user structured follow-up questions), and ever-tighter agent integration — servers as the standard way agents touch the world. The strategic bet: as agents become the dominant consumers, well-described capabilities become the currency, and MCP is the wire format for them.

Build it yourself

Follow these exact steps to reproduce it yourself · estimated time: ~40 minutes

Prerequisites

  • Python 3.10+
  • Claude Code installed (for step 4; the server itself needs nothing else)

You’ll build a team-notes MCP server that exercises all three primitives, inspect it, and plug it into Claude Code.

Step 1 — Set up

mkdir notes-mcp && cd notes-mcp
python3 -m venv .venv && source .venv/bin/activate
pip install "mcp[cli]"
mkdir notes && echo "Standup: ship the pipeline by Friday." > notes/2026-06-08.md

Step 2 — Write the server (all three primitives)

Create server.py:

from pathlib import Path

from mcp.server.fastmcp import FastMCP

NOTES_DIR = Path(__file__).parent / "notes"
mcp = FastMCP("team-notes")


# TOOL — model-controlled: Claude decides when searching is useful
@mcp.tool()
def search_notes(query: str) -> str:
    """Search all team notes for a case-insensitive text match.
    Returns matching lines with their note date."""
    hits = []
    for note in sorted(NOTES_DIR.glob("*.md")):
        for line in note.read_text().splitlines():
            if query.lower() in line.lower():
                hits.append(f"[{note.stem}] {line.strip()}")
    return "\n".join(hits) or f"No notes match {query!r}."


# TOOL with a write action — keep it separate and obvious
@mcp.tool()
def add_note(date: str, content: str) -> str:
    """Append content to the note for a given date (YYYY-MM-DD)."""
    path = NOTES_DIR / f"{date}.md"
    with path.open("a") as f:
        f.write(content.rstrip() + "\n")
    return f"Added to {path.name}."


# RESOURCE — app-controlled: the host attaches a specific note as context
@mcp.resource("note://{date}")
def get_note(date: str) -> str:
    """The full text of one day's team note."""
    path = NOTES_DIR / f"{date}.md"
    return path.read_text() if path.exists() else f"No note for {date}."


# PROMPT — user-controlled: an explicit recipe the user invokes
@mcp.prompt()
def weekly_summary() -> str:
    """Summarize this week's team notes."""
    return (
        "Use the search_notes tool to gather this week's notes, then write a "
        "5-bullet summary: decisions, blockers, and deadlines."
    )


if __name__ == "__main__":
    mcp.run()  # stdio transport by default

Step 3 — Inspect it before connecting anything

mcp dev server.py

This opens the MCP Inspector in your browser. Confirm: search_notes and add_note under Tools, note://{date} under Resources, weekly_summary under Prompts. Call search_notes with "pipeline" — you should get the standup line back. The inspector is to MCP what curl is to HTTP; always check here first.

Step 4 — Register it with Claude Code

claude mcp add team-notes -- $(pwd)/.venv/bin/python $(pwd)/server.py
claude

In the session, try: “Search our team notes for anything about the pipeline deadline.” Expected result: Claude asks permission to use team-notes:search_notes, calls it, and answers citing the note — your server is now part of the agentic loop. The prompt appears as the /team-notes:weekly_summary command.

Step 5 — Reflect (this is the 201 part)

Notice what you did not build: no model calls, no orchestration, no UI. The server only describes capabilities. Now make the design call the lesson taught: should add_note exist as a tool at all, or should writing notes be a user-invoked prompt? There is no universally right answer — but now it’s a deliberate decision about who controls the action.

Where to go next

Related lessons

intermediate 🎬 Anthropic · ~25 min

Prompting for Agents: Steering Models That Act

Agents are models using tools in a loop. This lesson covers when to build one, how to prompt it — heuristics, budgets, guardrails — and how to evaluate something that takes hundreds of steps.

#agents #prompting #evaluation