intermediate ⏱️ 16 min read · 🎬 ~34 min video

Build AI Agents with Claude in Microsoft Azure AI Foundry

A hands-on guide to provisioning Claude in Microsoft Azure AI Foundry, connecting it to MCP servers via Claude Code, and deploying enterprise-grade AI agents, from zero to working code.

This lesson is original educational writing based on this video by Anthropic (published May 20, 2026). All credit for the original content goes to the creators.

#agents #enterprise #productivity

1. Why Microsoft Azure AI Foundry?

When enterprises adopt AI, a direct API key wired into a developer laptop is rarely good enough for production. They need audit logs, cost management, data residency guarantees, access control tied to corporate identity, and content moderation that legal can sign off on. Microsoft Azure AI Foundry (previously called Azure AI Studio) is Microsoft’s answer: a managed layer that hosts third-party models, including Claude, inside the customer’s own Azure subscription.

From a developer standpoint, Foundry looks like a thin proxy: you send the same Anthropic messages API requests, but the URL is your Foundry endpoint, and authentication uses an Azure credential instead of an Anthropic API key. The model itself (the weights, the RLHF, the extended thinking) is exactly the same Claude you’d get at api.anthropic.com.
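
To make the "thin proxy" point concrete, here is a minimal sketch that builds the same request for both targets and changes only the URL, the auth header, and the model name. The `/v1/messages` path on the Foundry host and the `Authorization: Bearer` header are assumptions for illustration, and `build_request` is a hypothetical helper; check your deployment details page for the real values. Nothing is sent over the network.

```python
import json

def build_request(base_url, auth_headers, model, prompt):
    """Return (url, headers, body) for a messages API call without sending it."""
    body = {
        "model": model,
        "max_tokens": 256,
        "messages": [{"role": "user", "content": prompt}],
    }
    headers = {"content-type": "application/json", **auth_headers}
    url = base_url.rstrip("/") + "/v1/messages"
    return url, headers, json.dumps(body, sort_keys=True)

# Direct Anthropic API: authenticates with an API key header.
direct = build_request(
    "https://api.anthropic.com",
    {"x-api-key": "<anthropic-api-key>", "anthropic-version": "2023-06-01"},
    "claude-sonnet-4-5",
    "Hello",
)

# Foundry (assumed URL path and header shape): authenticates with an Azure token
# and addresses the model by deployment name.
foundry = build_request(
    "https://<project-name>.openai.azure.com/",
    {"Authorization": "Bearer <azure-token>"},
    "claude-sonnet-prod",
    "Hello",
)
```

The `messages` payload is identical in both cases, which is why agent code written against one target usually ports to the other with configuration changes only.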

What you actually get with Foundry:

  • Data residency: requests stay within your Azure region (e.g. East US 2), never leaving the Microsoft network boundary.
  • Managed identity: workloads authenticate via Microsoft Entra ID (formerly Azure Active Directory), not a shared API key sitting in a .env file.
  • Content filters: Azure AI Content Safety wraps the model; you configure threshold levels per deployment.
  • Cost & usage control: token quotas, rate limits and spending caps are enforced per deployment, not across your whole org.
  • Private endpoints: lock the Foundry endpoint to your VNet so it’s never reachable from the public internet.
  • Integrated monitoring: token consumption, latency percentiles and error rates land in Azure Monitor / Log Analytics alongside your other infra metrics.

2. The Foundry + Claude Architecture

Before writing any code, understand what Foundry is and is not:

  • It is not a fine-tuning layer: you deploy the base Claude model, not a custom version.
  • It is not a hosted agent runtime: you still build and run the agent code yourself; Foundry just hosts the model inference endpoint.
  • It is a managed inference proxy with enterprise guardrails bolted on.

[Diagram: Azure Subscription (customer tenant) containing Agent Code (Claude Code / SDK), Azure Entra ID (Managed Identity), Foundry Endpoint (Content Filter · Quota), Claude Model (Sonnet / Opus / Haiku), and Azure Monitor (Logs · Metrics · Alerts)]
Request flow from agent code through Azure AI Foundry to Claude and back. Every hop inside the dashed boundary stays within the customer's Azure subscription.

The solid arrows show the prompt/response path. The dashed arrows show the authentication flow (Entra ID issues a short-lived token) and the telemetry path (Foundry emits usage events to Azure Monitor). Your agent code never holds a long-lived Anthropic API key.

3. Provisioning Claude in Foundry

The provisioning flow in the Azure portal takes about ten minutes once you have the prerequisites in place.

Prerequisites

  • An Azure subscription with Contributor or Owner role on the target resource group.
  • Access to the Azure AI Foundry resource type (you may need to request quota for Claude models; check the Azure AI model catalog).
  • Claude models are available through the Azure AI Model Catalog as a Marketplace offering; you accept the Anthropic Terms of Service once during provisioning.

Step-by-step in the portal

  1. Create an Azure AI Hub: this is the top-level governance container. One Hub per team or department is the recommended pattern. Set your region here; it determines data residency.

  2. Create a Project inside the Hub: projects are workspaces that share Hub-level network and identity settings. Create one project per application or agent.

  3. Deploy a Claude model: inside the project, open Model Catalog → Anthropic and click Deploy. Choose the model version (e.g. claude-sonnet-4-5), deployment name, and tokens-per-minute quota. A few minutes later you have an endpoint URL.

  4. Note the endpoint and deployment name. The endpoint looks like:

    https://<project-name>.openai.azure.com/

    The deployment name is what you passed in step 3 (e.g. claude-sonnet-prod).

  5. Assign a managed identity: for your compute (App Service, Container App, AKS pod, etc.), assign a User-Assigned Managed Identity and grant it the Cognitive Services User role on the Foundry resource. This eliminates the need for any API key in your environment.

4. Connecting Claude Code to Your Foundry Endpoint

Claude Code is itself an MCP host, and it talks to Claude to power its own coding assistant capabilities. In an enterprise setting you may want Claude Code to call your Foundry endpoint rather than api.anthropic.com so that all LLM traffic stays inside your Azure perimeter.

Setting the endpoint in Claude Code

Claude Code reads environment variables to configure its API client. Two variables matter:

# The Foundry endpoint for your project
export ANTHROPIC_BASE_URL="https://<your-project>.openai.azure.com/"

# An Azure AD token (or a key-based credential for development only)
export ANTHROPIC_API_KEY="<azure-ad-token-or-foundry-key>"

For development with a key (not recommended for production), grab the key from the Azure portal under Foundry Project → Keys and Endpoint. For production, write a small wrapper that fetches a short-lived Azure AD token using the managed identity:

from azure.identity import DefaultAzureCredential

credential = DefaultAzureCredential()
token = credential.get_token("https://cognitiveservices.azure.com/.default")
print(token.token)   # pipe this into ANTHROPIC_API_KEY

DefaultAzureCredential automatically picks up the managed identity when running on Azure compute, or your local az login session during development, so the same code works in both places.
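
Because these tokens are short-lived, a long-running wrapper needs to refetch them. The following is a minimal sketch of one way to do that, assuming a small cache that refreshes a few minutes before expiry. `Token`, `TokenCache`, and `margin_s` are hypothetical names introduced here; the fetch function is injected so the cache can be exercised without Azure.

```python
import time
from typing import Callable, NamedTuple, Optional

class Token(NamedTuple):
    token: str        # the bearer token string
    expires_on: int   # expiry as Unix epoch seconds

class TokenCache:
    """Cache a short-lived token and refetch it shortly before it expires."""

    def __init__(self, fetch: Callable[[], Token], margin_s: int = 300,
                 clock: Callable[[], float] = time.time):
        self._fetch = fetch
        self._margin_s = margin_s
        self._clock = clock
        self._cached: Optional[Token] = None

    def get(self) -> str:
        now = self._clock()
        # Refetch on first use, or when the token is within margin_s of expiry.
        if self._cached is None or self._cached.expires_on - now <= self._margin_s:
            self._cached = self._fetch()
        return self._cached.token
```

With azure.identity, the fetch function could be `lambda: credential.get_token("https://cognitiveservices.azure.com/.default")`, since the object it returns also exposes `token` and `expires_on`.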

Verifying the connection

claude -p "Reply with just the word FOUNDRY if you can hear me"

If you see FOUNDRY printed, your local Claude Code is now routing through your Foundry endpoint.

5. Adding MCP Servers for Real Tool Use

A Claude agent that can only reason is half an agent. The Model Context Protocol (MCP) is how you give Claude access to external tools: databases, APIs, file systems, calendars, code runners. In the Foundry workshop the demo wires up an MCP server so the agent can perform real actions.

How MCP fits into the Foundry setup

Your agent code acts as the MCP host: it connects to one or more MCP servers (local processes or network services), discovers their tools, and includes the tool definitions in every request to Claude via Foundry. When Claude wants to call a tool it returns a tool_use block; your host calls the MCP server and returns the result as a tool_result message. Foundry sees only the messages API traffic; it has no awareness of MCP.

User prompt
   │
   ▼
Agent code (MCP host)
   ├─ discovers tools from MCP servers
   ├─ sends prompt + tool list → Foundry → Claude
   │
   Claude returns tool_use block
   │
   ├─ agent calls MCP server tool
   ├─ sends tool_result → Foundry → Claude
   │
   Claude returns final answer
   │
   ▼
User response
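
The two translation steps in this flow can be sketched as follows: turning MCP tool descriptors into the messages API `tools` list, and wrapping a tool's output as a tool_result message. This is a minimal sketch that works on plain dictionaries; the helper names and the `read_file` descriptor are hypothetical, and a real host would obtain the descriptors from an MCP client library.

```python
def mcp_tools_to_anthropic(mcp_tools):
    """Map MCP tool descriptors to the messages API `tools` format."""
    converted = []
    for tool in mcp_tools:
        converted.append({
            "name": tool["name"],
            "description": tool.get("description", ""),
            # MCP uses camelCase `inputSchema`; the messages API uses `input_schema`.
            "input_schema": tool.get("inputSchema", {"type": "object", "properties": {}}),
        })
    return converted

def tool_result_message(tool_use_id, result_text, is_error=False):
    """Wrap an MCP tool's output as the user message Claude expects next."""
    return {
        "role": "user",
        "content": [{
            "type": "tool_result",
            "tool_use_id": tool_use_id,
            "content": result_text,
            "is_error": is_error,
        }],
    }

# Hypothetical descriptor, shaped like one entry of an MCP tools/list response.
example = [{
    "name": "read_file",
    "description": "Read a file from the workspace",
    "inputSchema": {
        "type": "object",
        "properties": {"path": {"type": "string"}},
        "required": ["path"],
    },
}]
tools = mcp_tools_to_anthropic(example)
```

Keeping the translation in the host is what lets Foundry stay unaware of MCP: by the time a request leaves your code, it is ordinary messages API traffic.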

Connecting Claude Code to an MCP server

Claude Code ships with built-in MCP support. Add a server with:

claude mcp add filesystem -- npx -y @modelcontextprotocol/server-filesystem /tmp/workspace

Or edit .mcp.json in the project root directly for more complex configurations:

{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-filesystem", "/tmp/workspace"]
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
    }
  }
}

Claude Code discovers these servers on startup, lists their tools, and automatically includes them in its context. When you ask Claude to “query the customers table,” it knows it has a query tool available and uses it.
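
Before handing a config like the one above to Claude Code, it can help to check what each entry will actually launch. The sketch below is a hypothetical helper (not part of Claude Code) that parses the JSON and returns the full command line per server, assuming the `mcpServers` / `command` / `args` layout shown above.

```python
import json

def summarize_mcp_config(config_text):
    """Return {server_name: full command line}; reject entries with no command."""
    config = json.loads(config_text)
    summary = {}
    for name, spec in config.get("mcpServers", {}).items():
        if "command" not in spec:
            raise ValueError(f"server {name!r} has no 'command'")
        summary[name] = " ".join([spec["command"], *spec.get("args", [])])
    return summary

sample = """
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["@modelcontextprotocol/server-filesystem", "/tmp/workspace"]
    }
  }
}
"""
print(summarize_mcp_config(sample))
```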

The cupcake demo

The workshop’s live demo builds an agent that uses a file-system MCP server to track a “cupcake order”: it reads from and writes to files in a workspace directory, confirming order details before committing them. The point isn’t cupcakes; it’s showing that the agent can read state, reason about it, and write changes through a real tool rather than hallucinating file contents.

Check your understanding

  1. What is the primary reason enterprises choose Azure AI Foundry over direct Anthropic API access?

  2. Which environment variable tells Claude Code to send its LLM requests to your Foundry endpoint instead of api.anthropic.com?

  3. In the Foundry + MCP architecture, which component is aware of MCP tool definitions?

  4. What does DefaultAzureCredential do when your code runs on an Azure VM or Container App that has a managed identity assigned?

  5. An enterprise wants to ensure Claude never processes requests that exceed their content policy. Where in the Foundry architecture should they configure this?

6. Enterprise Security Patterns

The workshop covers several production-hardening techniques worth understanding before you ship.

Private endpoints

By default, your Foundry endpoint is a public HTTPS URL. In a high-security environment, create a Private Endpoint in your VNet. Traffic from your app to Claude now flows through the Microsoft backbone network; the endpoint is unreachable from the public internet. This pairs naturally with App Service / AKS behind an internal load balancer.

Network rules and IP allowlists

If a full private endpoint is too complex for your use case, lock the Foundry deployment to a set of known egress IPs (e.g. your NAT gateway’s IP). That way a stolen credential alone is not enough to call the endpoint from an arbitrary network.

Role-based access control (RBAC)

Two roles matter:

Role | Who gets it
Cognitive Services User | The managed identity of compute that calls the endpoint (reads/uses the model)
Cognitive Services Contributor | The team that manages deployments, quotas and content filters (not the app)

Never give application compute Contributor rights; least privilege applies here.

Content safety tiers

Foundry’s built-in content filter has four severity levels (safe, low, medium, high) across four categories (hate, sexual, violence, self-harm). For each category you configure an action (block or flag). A reasonable enterprise default:

  • Block at medium for all categories in user-facing applications.
  • Flag (but don’t block) at low for internal developer tools where you want visibility without friction.
  • Log everything to Azure Monitor for your SOC team.
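
The threshold-plus-action model above can be sketched as a small decision function. This is an illustrative model of the logic only, not the Azure AI Content Safety API; `decide`, `user_facing`, and `internal_tool` are hypothetical names, and real filters are configured on the deployment rather than in application code.

```python
SEVERITY_ORDER = ["safe", "low", "medium", "high"]
CATEGORIES = ["hate", "sexual", "violence", "self-harm"]

def decide(scores, policy):
    """Return 'block', 'flag', or 'allow' for one request.

    scores: {category: severity}
    policy: {category: (threshold, action)} where action is 'block' or 'flag'
    """
    outcome = "allow"
    for category, severity in scores.items():
        threshold, action = policy[category]
        if SEVERITY_ORDER.index(severity) >= SEVERITY_ORDER.index(threshold):
            if action == "block":
                return "block"   # any blocking category ends the evaluation
            outcome = "flag"     # remember the flag, keep checking for blocks
    return outcome

# The two defaults suggested above, expressed as policies.
user_facing = {c: ("medium", "block") for c in CATEGORIES}
internal_tool = {c: ("low", "flag") for c in CATEGORIES}
```

Under these policies, a request scored medium for violence is blocked in the user-facing application but only flagged in the internal tool.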

Secrets management

Even if you use managed identity for the prod app, developers still need credentials locally. The recommended pattern:

  1. Developers run az login, so there are no secrets in their env.
  2. DefaultAzureCredential picks up the local az login session.
  3. CI/CD pipelines use workload identity federation (GitHub Actions OIDC → Azure AD), so there are still no secrets.
  4. If you must use a key (e.g. testing from a non-Azure machine), store it in Azure Key Vault and fetch it at startup, never in environment variables or .env files.

Build it yourself

Follow these exact steps to reproduce it yourself · estimated time: ~30 min

Prerequisites

  • An Azure subscription with Contributor access to a resource group
  • Azure CLI installed and authenticated (`az login`)
  • Node.js 18+ and Python 3.11+ installed
  • Claude Code installed (`npm install -g @anthropic-ai/claude-code`)

Step 1: Create an Azure AI Hub and Project

# Create a resource group (choose your region)
az group create --name rg-claude-foundry --location eastus2

# Create the AI Hub
az cognitiveservices account create \
  --name hub-claude-demo \
  --resource-group rg-claude-foundry \
  --kind AIServices \
  --sku S0 \
  --location eastus2 \
  --yes

# Note the endpoint URL from the output
az cognitiveservices account show \
  --name hub-claude-demo \
  --resource-group rg-claude-foundry \
  --query "properties.endpoint" -o tsv

Alternatively, create the Hub through the Azure AI Foundry portal at ai.azure.com; the UI wizard is more forgiving for first-timers.

Step 2: Deploy a Claude model

In the portal, navigate to your AI Hub → Model Catalog → search for Claude → select Claude Sonnet (or any available version) → click Deploy.

  • Deployment name: claude-sonnet-prod
  • Tokens per minute: start with 40,000 (you can increase later)
  • Accept the Anthropic terms of service

Once deployed, copy the endpoint URL and deployment name from the deployment details page.

Step 3: Set up credentials locally

# Confirm you're logged in to the right Azure subscription
az account show

# Fetch an access token for Cognitive Services
TOKEN=$(az account get-access-token \
  --resource https://cognitiveservices.azure.com/ \
  --query accessToken -o tsv)

# Export for the Anthropic SDK
export ANTHROPIC_API_KEY="$TOKEN"
export ANTHROPIC_BASE_URL="https://<your-hub-name>.openai.azure.com/"

For a persistent dev setup, add a function to your shell profile that refreshes the token on demand, since Azure tokens expire in about an hour.
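
The same refresh-on-demand idea can be sketched in Python for scripts that launch tools from code rather than from a shell. This is a minimal sketch, assuming the Azure CLI is on the PATH and that its JSON output carries the token in an `accessToken` field; `env_from_az_json` and `fresh_env` are hypothetical helper names.

```python
import json
import subprocess

AZ_CMD = [
    "az", "account", "get-access-token",
    "--resource", "https://cognitiveservices.azure.com/",
    "--output", "json",
]

def env_from_az_json(raw_json, base_url):
    """Build the two environment variables from the CLI's JSON output."""
    payload = json.loads(raw_json)
    return {
        "ANTHROPIC_API_KEY": payload["accessToken"],
        "ANTHROPIC_BASE_URL": base_url,
    }

def fresh_env(base_url):
    """Shell out to the Azure CLI on every call, so the token is never stale."""
    raw = subprocess.run(AZ_CMD, check=True, capture_output=True, text=True).stdout
    return env_from_az_json(raw, base_url)
```

A caller would merge the result into the child process environment, for example `subprocess.run(["claude", "-p", "hello"], env={**os.environ, **fresh_env(url)})`, so each launch starts with a token fetched moments earlier.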

Step 4: Verify Claude responds through Foundry

claude -p "Say FOUNDRY_CONNECTED if you can hear me"

You should see FOUNDRY_CONNECTED. If you get an auth error, double-check ANTHROPIC_BASE_URL ends with a trailing slash and the token hasn’t expired.
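
Both checks can be run locally before retrying. The sketch below assumes the credential is a JWT, as Azure AD access tokens are, and reads its `exp` claim without verifying the signature; a key-based credential is not a JWT and would raise an error here. `jwt_expiry` and `diagnose` are hypothetical helper names.

```python
import base64
import json
import time

def jwt_expiry(token):
    """Read the `exp` claim from a JWT payload without verifying the signature."""
    payload_b64 = token.split(".")[1]
    payload_b64 += "=" * (-len(payload_b64) % 4)  # restore stripped base64 padding
    payload = json.loads(base64.urlsafe_b64decode(payload_b64))
    return int(payload["exp"])

def diagnose(base_url, token, now=None):
    """Return a list of likely causes for an auth error; empty if none found."""
    now = time.time() if now is None else now
    problems = []
    if not base_url.endswith("/"):
        problems.append("ANTHROPIC_BASE_URL has no trailing slash")
    if jwt_expiry(token) <= now:
        problems.append("token has expired")
    return problems
```

Calling `diagnose(os.environ["ANTHROPIC_BASE_URL"], os.environ["ANTHROPIC_API_KEY"])` after an auth failure narrows the cause without another round trip to the endpoint.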

Step 5: Add a file-system MCP server

# Create the workspace directory first so the server has a valid root
mkdir -p "$HOME/foundry-workspace"

# Install the reference file-system MCP server
npm install -g @modelcontextprotocol/server-filesystem

# Add it to Claude Code; everything after "--" is the command that starts the server
claude mcp add workspace -- \
  npx -y @modelcontextprotocol/server-filesystem \
  "$HOME/foundry-workspace"

Restart Claude Code (claude) and verify the tool is available:

/mcp

You should see workspace listed with its available tools (read_file, write_file, list_directory, etc.).

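Step 6: Build the agent, a simple order tracker

Create a Python script that mimics the workshop’s cupcake demo concept: an agent that manages a simple stateful task through tools. To keep the script self-contained, the two tools are defined inline as stand-ins for the tools an MCP server would expose: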

import json
import os

import anthropic

# Client automatically uses ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY from env
client = anthropic.Anthropic()

WORKSPACE = os.path.expanduser("~/foundry-workspace")
ORDER_FILE = os.path.join(WORKSPACE, "order.json")

# Define tools (in production, discover these from your MCP server)
tools = [
    {
        "name": "read_order",
        "description": "Read the current order state from disk",
        "input_schema": {"type": "object", "properties": {}, "required": []},
    },
    {
        "name": "write_order",
        "description": "Write updated order state to disk",
        "input_schema": {
            "type": "object",
            "properties": {
                "order": {"type": "object", "description": "The order data to save"}
            },
            "required": ["order"],
        },
    },
]

def handle_tool(name, tool_input):
    """Run one tool call locally and return a JSON-serializable result."""
    if name == "read_order":
        if os.path.exists(ORDER_FILE):
            with open(ORDER_FILE) as f:
                return json.load(f)
        return {}
    if name == "write_order":
        os.makedirs(WORKSPACE, exist_ok=True)  # the workspace may not exist yet
        with open(ORDER_FILE, "w") as f:
            json.dump(tool_input["order"], f, indent=2)
        return {"status": "saved"}
    # Report unknown tools back to the model instead of returning None
    return {"error": f"unknown tool: {name}"}

def run_agent(user_message):
    messages = [{"role": "user", "content": user_message}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-prod",  # your Foundry deployment name
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )

        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            # end_turn, max_tokens, etc.: print any text and stop looping
            for block in response.content:
                if block.type == "text":
                    print(f"Agent: {block.text}")
            break

        # Run every requested tool and return all results in one user turn
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = handle_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result),
                })
        messages.append({"role": "user", "content": tool_results})

# Try it
if __name__ == "__main__":
    run_agent("I'd like to order 12 vanilla cupcakes for Friday. Check if there's an existing order first, then create or update it.")

Save the script as agent.py and run it:

python3 agent.py

Expected output: The agent reads the (empty) order file, creates a new order with your request, writes it to disk, and reports back. Run it again with “add 6 chocolate cupcakes to my order” and it will read the existing state and update it.

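Step 7: Review the audit log

# List the last hour of activity-log entries for the resource group
# (macOS/BSD date syntax; on Linux use: date -u -d '1 hour ago' +%Y-%m-%dT%H:%M:%SZ)
az monitor activity-log list \
  --resource-group rg-claude-foundry \
  --start-time "$(date -u -v-1H +%Y-%m-%dT%H:%M:%SZ)" \
  --query "[].{time:eventTimestamp, op:operationName.localizedValue}" \
  --output table

The activity log records management operations such as creating the resource and its deployments. To see the model inference calls your agent made, enable diagnostic settings on the resource and route request logs and metrics to Log Analytics. In a production setup, these feed into your SIEM.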

Checkpoint: You now have a Claude agent running through your Azure Foundry endpoint, backed by a real MCP tool, with audit logs flowing to Azure Monitor. The same pattern scales to any MCP server: databases, APIs, calendars, code runners.

Where to go next

  • Microsoft Azure AI Foundry documentation: the official reference for Hub/Project setup, content filters, and private endpoint configuration.
  • Model Context Protocol: understand how MCP works under the hood before you build your own server.
  • Building Effective Agents: the architectural patterns (orchestrators, subagents, tool loops) that apply whether you’re on Foundry, AWS Bedrock, or direct API.
  • Claude on AWS: the equivalent workshop for Amazon Bedrock, useful if your organization uses AWS instead of Azure.
