Build AI Agents with Claude in Microsoft Azure AI Foundry
A hands-on guide to provisioning Claude in Microsoft Azure AI Foundry, connecting it to MCP servers via Claude Code, and deploying enterprise-grade AI agents, from zero to working code.
This lesson is original educational writing based on this video by Anthropic (published May 20, 2026). All credit for the original content goes to the creators.
1. Why Microsoft Azure AI Foundry?
When enterprises adopt AI, a direct API key wired into a developer laptop is rarely good enough for production. They need audit logs, cost management, data residency guarantees, access control tied to corporate identity, and content moderation that legal can sign off on. Microsoft Azure AI Foundry (previously called Azure AI Studio) is Microsoft's answer: a managed layer that hosts third-party models, including Claude, inside the customer's own Azure subscription.
From a developer standpoint, Foundry looks like a thin proxy: you send the same Anthropic messages API requests, but the URL is your Foundry endpoint, and authentication uses an Azure credential instead of an Anthropic API key. The model itself (the weights, the RLHF, the extended thinking) is exactly the same Claude you'd get at api.anthropic.com.
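To make the "thin proxy" idea concrete, here is a minimal sketch that builds a Messages API request against a custom base URL without sending it. The helper name `build_messages_request` and the example host are hypothetical, and the `/v1/messages` path is an assumption carried over from the public Anthropic API; check your deployment details page for the real path.

```python
import json


def build_messages_request(base_url: str, deployment: str, prompt: str, max_tokens: int = 256):
    """Return (url, body) for a Messages API call aimed at a custom base URL."""
    # Assumption: the endpoint serves the standard /v1/messages path.
    url = base_url.rstrip("/") + "/v1/messages"
    body = {
        # With Foundry, this guide passes the deployment name as the model.
        "model": deployment,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return url, json.dumps(body)


# Hypothetical host, used only to show the shape of the request.
url, body = build_messages_request(
    "https://example-project.openai.azure.com/", "claude-sonnet-prod", "Hello"
)
print(url)
print(body)
```

The body is the same whether the request goes to Anthropic directly or through Foundry; only the URL and the credential change.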
What you actually get with Foundry:
- Data residency: requests stay within your Azure region (e.g. East US 2), never leaving the Microsoft network boundary.
- Managed identity: workloads authenticate via Microsoft Entra ID (formerly Azure Active Directory), not a shared API key sitting in a `.env` file.
- Content filters: Azure AI Content Safety wraps the model; you configure threshold levels per deployment.
- Cost & usage control: token quotas, rate limits and spending caps are enforced per deployment, not across your whole org.
- Private endpoints: lock the Foundry endpoint to your VNet so it's never reachable from the public internet.
- Integrated monitoring: token consumption, latency percentiles and error rates land in Azure Monitor / Log Analytics alongside your other infra metrics.
2. The Foundry + Claude Architecture
Before writing any code, understand what Foundry is and is not:
- It is not a fine-tuning layer: you deploy the base Claude model, not a custom version.
- It is not a hosted agent runtime: you still build and run the agent code yourself; Foundry just hosts the model inference endpoint.
- It is a managed inference proxy with enterprise guardrails bolted on.
In the architecture diagram for this setup, the solid arrows show the prompt/response path. The dashed arrows show the authentication flow (Entra ID issues a short-lived token) and the telemetry path (Foundry emits usage events to Azure Monitor). Your agent code never holds a long-lived Anthropic API key.
3. Provisioning Claude in Foundry
The provisioning flow in the Azure portal takes about ten minutes once you have the prerequisites in place.
Prerequisites
- An Azure subscription with Contributor or Owner role on the target resource group.
- Access to the Azure AI Foundry resource type (you may need to request quota for Claude models; check the Azure AI model catalog).
- Claude models are available through the Azure AI Model Catalog as a Marketplace offering; you accept the Anthropic Terms of Service once during provisioning.
Step-by-step in the portal
1. Create an Azure AI Hub: this is the top-level governance container. One Hub per team or department is the recommended pattern. Set your region here; it determines data residency.
2. Create a Project inside the Hub: projects are workspaces that share Hub-level network and identity settings. Create one project per application or agent.
3. Deploy a Claude model: inside the project, open Model Catalog → Anthropic and click Deploy. Choose the model version (e.g. `claude-sonnet-4-5`), deployment name, and tokens-per-minute quota. A few minutes later you have an endpoint URL.
4. Note the endpoint and deployment name: the endpoint looks like:
https://<project-name>.openai.azure.com/
The deployment name is what you passed in step 3 (e.g. `claude-sonnet-prod`).
5. Assign a managed identity: for your compute (App Service, Container App, AKS pod, etc.), assign a User-Assigned Managed Identity and grant it the Cognitive Services User role on the Foundry resource. This eliminates the need for any API key in your environment.
4. Connecting Claude Code to Your Foundry Endpoint
Claude Code is both an MCP host and a Claude client: it talks to Claude to power its own coding assistant capabilities. In an enterprise setting you may want Claude Code to call your Foundry endpoint rather than api.anthropic.com so that all LLM traffic stays inside your Azure perimeter.
Setting the endpoint in Claude Code
Claude Code reads environment variables to configure its API client. Two variables matter:
# The Foundry endpoint for your project
export ANTHROPIC_BASE_URL="https://<your-project>.openai.azure.com/"
# An Azure AD token (or a key-based credential for development only)
export ANTHROPIC_API_KEY="<azure-ad-token-or-foundry-key>"
For development with a key (not recommended for production), grab the key from the Azure portal under Foundry Project → Keys and Endpoint. For production, write a small wrapper that fetches a short-lived Azure AD token using the managed identity:
from azure.identity import DefaultAzureCredential
credential = DefaultAzureCredential()
token = credential.get_token("https://cognitiveservices.azure.com/.default")
print(token.token) # pipe this into ANTHROPIC_API_KEY
DefaultAzureCredential automatically picks up the managed identity when running on Azure compute, or your local az login session during development, so the same code works in both places.
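Because these tokens are short-lived, a long-running agent needs to refresh them. The following is a minimal sketch of a cache that fetches a new token shortly before the old one expires. `CachedToken`, its `skew_seconds` parameter, and `fake_fetch` are hypothetical names; with the real credential, the fetch function would call `credential.get_token(...)` and return its `token` and `expires_on` fields.

```python
import time
from typing import Callable, Optional, Tuple


class CachedToken:
    """Cache a short-lived token and refresh it shortly before it expires."""

    def __init__(
        self,
        fetch: Callable[[], Tuple[str, float]],
        skew_seconds: float = 300.0,
        clock: Callable[[], float] = time.time,
    ):
        self._fetch = fetch          # returns (token, expiry as Unix seconds)
        self._skew = skew_seconds    # refresh this long before the real expiry
        self._clock = clock          # injectable so the logic is testable
        self._token: Optional[str] = None
        self._expires_on = 0.0

    def get(self) -> str:
        # Fetch on first use, or once we are inside the refresh window.
        if self._token is None or self._clock() >= self._expires_on - self._skew:
            self._token, self._expires_on = self._fetch()
        return self._token


# Demo with a stub fetcher and a fake clock (no Azure calls are made).
calls = []


def fake_fetch() -> Tuple[str, float]:
    calls.append(1)
    return f"token-{len(calls)}", 1000.0 * len(calls)


now = [0.0]
cache = CachedToken(fake_fetch, skew_seconds=300.0, clock=lambda: now[0])
first = cache.get()    # first fetch; stub token expires at t=1000
now[0] = 600.0
second = cache.get()   # still outside the 300 s refresh window, so cached
now[0] = 750.0
third = cache.get()    # inside the window, so a new token is fetched
print(first, second, third)
```

Refreshing a few minutes early avoids sending a request with a token that expires in flight.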
Verifying the connection
claude -p "Reply with just the word FOUNDRY if you can hear me"
If you see FOUNDRY printed, your local Claude Code is now routing through your Foundry endpoint.
5. Adding MCP Servers for Real Tool Use
A Claude agent that can only reason is half an agent. The Model Context Protocol (MCP) is how you give Claude access to external tools: databases, APIs, file systems, calendars, code runners. In the Foundry workshop the demo wires up an MCP server so the agent can perform real actions.
How MCP fits into the Foundry setup
Your agent code acts as the MCP host: it connects to one or more MCP servers (local processes or network services), discovers their tools, and includes the tool definitions in every request to Claude via Foundry. When Claude wants to call a tool it returns a tool_use block; your host calls the MCP server and returns the result as a tool_result message. Foundry sees only the messages API traffic and has no awareness of MCP.
User prompt
  │
  ▼
Agent code (MCP host)
  ├─ discovers tools from MCP servers
  └─ sends prompt + tool list → Foundry → Claude
       │
       Claude returns tool_use block
       │
  ├─ agent calls MCP server tool
  └─ sends tool_result → Foundry → Claude
       │
       Claude returns final answer
  │
  ▼
User response
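The middle of that flow, turning tool_use blocks into tool_result blocks, can be sketched as below. The helper name `build_tool_results` and the stub handler are hypothetical; in a real MCP host each handler would forward the call to an MCP server instead of returning canned data.

```python
import json
from typing import Any, Callable, Dict, List


def build_tool_results(
    content: List[Dict[str, Any]],
    handlers: Dict[str, Callable[[Dict[str, Any]], Any]],
) -> List[Dict[str, Any]]:
    """Run each tool_use block through its handler and return tool_result blocks."""
    results = []
    for block in content:
        if block.get("type") != "tool_use":
            continue  # text blocks need no reply
        handler = handlers.get(block["name"])
        if handler is None:
            # Tell the model the tool is missing rather than failing silently.
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": f"Unknown tool: {block['name']}",
                "is_error": True,
            })
            continue
        try:
            output = handler(block["input"])
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": json.dumps(output),
            })
        except Exception as exc:  # report tool failures back to the model
            results.append({
                "type": "tool_result",
                "tool_use_id": block["id"],
                "content": str(exc),
                "is_error": True,
            })
    return results


# Stub assistant turn and stub handler, standing in for Claude and an MCP server.
assistant_content = [
    {"type": "text", "text": "Let me check the directory."},
    {"type": "tool_use", "id": "toolu_01", "name": "list_directory",
     "input": {"path": "/tmp/workspace"}},
]
handlers = {"list_directory": lambda args: {"path": args["path"], "entries": ["order.json"]}}
results = build_tool_results(assistant_content, handlers)
print(results)
```

The returned list becomes the content of the next user message, which is what lets Claude continue from the tool output.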
Connecting Claude Code to an MCP server
Claude Code ships with built-in MCP support. Add a server with the command below; everything after -- is the command that launches the server:
claude mcp add filesystem -- npx -y @modelcontextprotocol/server-filesystem /tmp/workspace
Or edit .mcp.json in the project root directly for more complex configurations:
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "/tmp/workspace"]
    },
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
    }
  }
}
Claude Code discovers these servers on startup, lists their tools, and automatically includes them in its context. When you ask Claude to "query the customers table," it knows it has a query tool available and uses it.
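A malformed config file is a common reason a server fails to appear, so a quick structural check before starting Claude Code can save time. This is a minimal sketch: `check_mcp_config` is a hypothetical helper, and the rules it enforces (string `command`, list-of-strings `args`, string-valued `env`) are an assumption based on the shape of the example above, not an official schema.

```python
import json
from typing import List


def check_mcp_config(text: str) -> List[str]:
    """Return a list of problems in an mcpServers config; empty means it looks fine."""
    config = json.loads(text)
    servers = config.get("mcpServers")
    if not isinstance(servers, dict) or not servers:
        return ["'mcpServers' must be a non-empty object"]
    problems = []
    for name, spec in servers.items():
        if not isinstance(spec, dict):
            problems.append(f"{name}: server entry must be an object")
            continue
        command = spec.get("command")
        if not isinstance(command, str) or not command:
            problems.append(f"{name}: 'command' must be a non-empty string")
        args = spec.get("args", [])
        if not isinstance(args, list) or not all(isinstance(a, str) for a in args):
            problems.append(f"{name}: 'args' must be a list of strings")
        env = spec.get("env", {})
        if not isinstance(env, dict) or not all(isinstance(v, str) for v in env.values()):
            problems.append(f"{name}: 'env' values must be strings")
    return problems


good = '{"mcpServers": {"filesystem": {"command": "npx", "args": ["pkg", "/tmp/workspace"]}}}'
bad = '{"mcpServers": {"filesystem": {"command": "npx", "args": "pkg /tmp/workspace"}}}'
print(check_mcp_config(good))
print(check_mcp_config(bad))
```

The second sample fails because `args` is a single string; each argument needs to be its own list element.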
The cupcake demo
The workshop's live demo builds an agent that uses a file-system MCP server to track a "cupcake order": it reads from and writes to files in a workspace directory, confirming order details before committing them. The point isn't cupcakes; it's showing that the agent can read state, reason about it, and write changes through a real tool rather than hallucinating file contents.
Check your understanding
1. What is the primary reason enterprises choose Azure AI Foundry over direct Anthropic API access?
2. Which environment variable tells Claude Code to send its LLM requests to your Foundry endpoint instead of api.anthropic.com?
3. In the Foundry + MCP architecture, which component is aware of MCP tool definitions?
4. What does DefaultAzureCredential do when your code runs on an Azure VM or Container App that has a managed identity assigned?
5. An enterprise wants to ensure Claude never processes requests that exceed their content policy. Where in the Foundry architecture should they configure this?
6. Enterprise Security Patterns
The workshop covers several production-hardening techniques worth understanding before you ship.
Private endpoints
By default, your Foundry endpoint is a public HTTPS URL. In a high-security environment, create a Private Endpoint in your VNet and disable public network access on the resource. Traffic from your app to Claude then flows over the Microsoft backbone network, and the endpoint is unreachable from the public internet. This pairs naturally with App Service / AKS behind an internal load balancer.
Network rules and IP allowlists
If a full private endpoint is too complex for your use case, lock the Foundry deployment to a set of known egress IPs (e.g. your NAT gateway's IP). A stolen credential is then useless to callers outside the allowlist.
Role-based access control (RBAC)
Two roles matter:
| Role | Who gets it |
|---|---|
| Cognitive Services User | The managed identity of compute that calls the endpoint (reads/uses the model) |
| Cognitive Services Contributor | The team that manages deployments, quotas and content filters (not the app) |
Never give application compute Contributor rights; least privilege applies here.
Content safety tiers
Foundry's built-in content filter has four severity levels (safe, low, medium, high) across four categories (hate, sexual, violence, self-harm). For each category you configure an action (block or flag). A reasonable enterprise default:
- Block at medium for all categories in user-facing applications.
- Flag (but don't block) at low for internal developer tools where you want visibility without friction.
- Log everything to Azure Monitor for your SOC team.
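The threshold rule behind those defaults can be modeled in a few lines. This is an illustrative sketch of the decision logic only, not the Azure AI Content Safety API: `evaluate`, `USER_FACING`, and `INTERNAL_TOOLS` are hypothetical names, and it assumes a threshold means "this severity or higher".

```python
from typing import Dict, Tuple

SEVERITIES = ("safe", "low", "medium", "high")   # ordered from least to most severe
OUTCOME_RANK = {"allow": 0, "flag": 1, "block": 2}
CATEGORIES = ("hate", "sexual", "violence", "self-harm")


def evaluate(scores: Dict[str, str], policy: Dict[str, Tuple[str, str]]) -> str:
    """Return the strictest action triggered by any category: allow, flag, or block."""
    outcome = "allow"
    for category, severity in scores.items():
        threshold, action = policy[category]
        # Assumption: a threshold applies to that severity and everything above it.
        if SEVERITIES.index(severity) >= SEVERITIES.index(threshold):
            if OUTCOME_RANK[action] > OUTCOME_RANK[outcome]:
                outcome = action
    return outcome


# The two defaults described above, expressed as (threshold, action) per category.
USER_FACING = {c: ("medium", "block") for c in CATEGORIES}
INTERNAL_TOOLS = {c: ("low", "flag") for c in CATEGORIES}

print(evaluate({"hate": "low", "violence": "safe"}, USER_FACING))
print(evaluate({"violence": "medium"}, USER_FACING))
print(evaluate({"hate": "low"}, INTERNAL_TOOLS))
```

Under the user-facing policy a low-severity hit passes and a medium one is blocked, while the internal policy only flags, which matches the "visibility without friction" goal.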
Secrets management
Even if you use managed identity for the prod app, developers still need credentials locally. The recommended pattern:
- Developers run `az login`: no secrets in their env. `DefaultAzureCredential` picks up the local `az login` session.
- CI/CD pipelines use workload identity federation (GitHub Actions OIDC → Azure AD), so there are still no secrets.
- If you must use a key (e.g. testing from a non-Azure machine), store it in Azure Key Vault and fetch it at startup, never in environment variables or `.env` files.
Build it yourself
Follow these exact steps to reproduce it yourself · estimated time: ~30 min
Prerequisites
- An Azure subscription with Contributor access to a resource group
- Azure CLI installed and authenticated (`az login`)
- Node.js 18+ and Python 3.11+ installed
- Claude Code installed (`npm install -g @anthropic-ai/claude-code`)
Step 1: Create an Azure AI Hub and Project
# Create a resource group (choose your region)
az group create --name rg-claude-foundry --location eastus2
# Create the AI Foundry resource (kind AIServices) that this guide calls the Hub
az cognitiveservices account create \
--name hub-claude-demo \
--resource-group rg-claude-foundry \
--kind AIServices \
--sku S0 \
--location eastus2 \
--yes
# Note the endpoint URL from the output
az cognitiveservices account show \
--name hub-claude-demo \
--resource-group rg-claude-foundry \
--query "properties.endpoint" -o tsv
Alternatively, create the Hub through the Foundry portal at ai.azure.com; the UI wizard is more forgiving for first-timers.
Step 2: Deploy a Claude model
In the Azure portal, navigate to your AI Hub → Model Catalog → search for Claude → select Claude Sonnet (or any available version) → click Deploy.
- Deployment name: `claude-sonnet-prod`
- Tokens per minute: start with 40,000 (you can increase later)
- Accept the Anthropic terms of service
Once deployed, copy the endpoint URL and deployment name from the deployment details page.
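To judge whether a 40,000 tokens-per-minute quota is enough, a back-of-envelope calculation helps. This sketch assumes both input and output tokens count against the quota, which you should confirm for your deployment; `max_requests_per_minute` and the per-request token counts are hypothetical.

```python
def max_requests_per_minute(tokens_per_minute: int, input_tokens: int, output_tokens: int) -> int:
    """Estimate how many requests fit in a per-minute token quota."""
    per_request = input_tokens + output_tokens
    if per_request <= 0:
        raise ValueError("a request must consume at least one token")
    # Assumption: input and output tokens are both charged against the quota.
    return tokens_per_minute // per_request


# Example: 1,500 prompt tokens plus 500 response tokens per call.
estimate = max_requests_per_minute(40_000, input_tokens=1_500, output_tokens=500)
print(estimate)  # 20
```

An agent loop makes several model calls per user request, so divide the estimate by the expected number of tool round-trips to get user-level throughput.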
Step 3: Set up credentials locally
# Confirm you're logged in to the right Azure subscription
az account show
# Fetch an access token for Cognitive Services
TOKEN=$(az account get-access-token \
--resource https://cognitiveservices.azure.com/ \
--query accessToken -o tsv)
# Export for the Anthropic SDK
export ANTHROPIC_API_KEY="$TOKEN"
export ANTHROPIC_BASE_URL="https://<your-hub-name>.openai.azure.com/"
For a persistent dev setup, add a function to your shell profile that refreshes the token on demand, since Azure tokens expire in ~1 hour.
Step 4: Verify Claude responds through Foundry
claude -p "Say FOUNDRY_CONNECTED if you can hear me"
You should see FOUNDRY_CONNECTED. If you get an auth error, double-check ANTHROPIC_BASE_URL ends with a trailing slash and the token hasn't expired.
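When an auth error appears, checking the token's remaining lifetime is a quick first diagnostic. The sketch below reads the `exp` claim from a JWT without verifying the signature, assuming the credential is an Entra ID access token in JWT form; a key-based credential is not a JWT and will raise `ValueError`. The helper names and the demo token are hypothetical, and this is for local debugging only, not for validating tokens.

```python
import base64
import json
import time
from typing import Optional


def jwt_expiry(token: str) -> int:
    """Read the 'exp' claim (Unix seconds) from a JWT without verifying it."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a JWT: expected three dot-separated segments")
    payload = parts[1]
    payload += "=" * (-len(payload) % 4)  # restore the padding that JWTs strip
    claims = json.loads(base64.urlsafe_b64decode(payload))
    return int(claims["exp"])


def seconds_remaining(token: str, now: Optional[float] = None) -> float:
    """Return how many seconds are left before the token expires (negative if expired)."""
    current = time.time() if now is None else now
    return jwt_expiry(token) - current


def _fake_jwt(exp: int) -> str:
    """Build an unsigned demo token so the example runs without Azure."""
    def enc(obj: dict) -> str:
        return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()
    return f"{enc({'alg': 'none'})}.{enc({'exp': exp})}.sig"


demo = _fake_jwt(1_700_003_600)
print(seconds_remaining(demo, now=1_700_000_000))  # 3600
```

In practice you would pass `os.environ["ANTHROPIC_API_KEY"]` to `seconds_remaining` and fetch a fresh token when the result is close to zero.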
Step 5: Add a file-system MCP server
# Install the reference file-system MCP server
npm install -g @modelcontextprotocol/server-filesystem
# Create the workspace directory first so the server has an existing path to serve
mkdir -p "$HOME/foundry-workspace"
# Add it to Claude Code (everything after -- is the server launch command)
claude mcp add workspace -- npx -y @modelcontextprotocol/server-filesystem "$HOME/foundry-workspace"
Restart Claude Code (claude) and verify the tool is available:
/mcp
You should see workspace listed with its available tools (read_file, write_file, list_directory, etc.).
Step 6: Build the agent, a simple order tracker
Create a Python script that mimics the workshop's cupcake demo concept: an agent that manages a simple stateful task through tools. The tools are defined inline here to keep the script self-contained:
import json
import os

import anthropic

# Client automatically uses ANTHROPIC_BASE_URL and ANTHROPIC_API_KEY from env
client = anthropic.Anthropic()

WORKSPACE = os.path.expanduser("~/foundry-workspace")
ORDER_FILE = os.path.join(WORKSPACE, "order.json")

# Define tools (in production, discover these from your MCP server)
tools = [
    {
        "name": "read_order",
        "description": "Read the current order state from disk",
        "input_schema": {"type": "object", "properties": {}, "required": []},
    },
    {
        "name": "write_order",
        "description": "Write updated order state to disk",
        "input_schema": {
            "type": "object",
            "properties": {
                "order": {"type": "object", "description": "The order data to save"}
            },
            "required": ["order"],
        },
    },
]


def handle_tool(name, tool_input):
    """Run one tool call locally and return a JSON-serializable result."""
    if name == "read_order":
        if os.path.exists(ORDER_FILE):
            with open(ORDER_FILE) as f:
                return json.load(f)
        return {}
    if name == "write_order":
        os.makedirs(WORKSPACE, exist_ok=True)  # make sure the workspace exists
        with open(ORDER_FILE, "w") as f:
            json.dump(tool_input["order"], f, indent=2)
        return {"status": "saved"}
    # Unknown tool: report it to the model instead of returning None
    return {"error": f"unknown tool: {name}"}


def run_agent(user_message):
    messages = [{"role": "user", "content": user_message}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-prod",  # your Foundry deployment name
            max_tokens=1024,
            tools=tools,
            messages=messages,
        )
        messages.append({"role": "assistant", "content": response.content})

        if response.stop_reason != "tool_use":
            # end_turn, max_tokens, etc.: print any text and stop looping
            for block in response.content:
                if block.type == "text":
                    print(f"Agent: {block.text}")
            break

        # Run every requested tool and return all results in one user message
        tool_results = []
        for block in response.content:
            if block.type == "tool_use":
                result = handle_tool(block.name, block.input)
                tool_results.append({
                    "type": "tool_result",
                    "tool_use_id": block.id,
                    "content": json.dumps(result),
                })
        messages.append({"role": "user", "content": tool_results})


if __name__ == "__main__":
    # Try it
    run_agent("I'd like to order 12 vanilla cupcakes for Friday. Check if there's an existing order first, then create or update it.")

Save the script as agent.py and run it:
python3 agent.py
Expected output: The agent reads the (empty) order file, creates a new order with your request, writes it to disk, and reports back. Change the prompt to "add 6 chocolate cupcakes to my order", run it again, and it will read the existing state and update it.
Step 7: Review the audit log
# List management-plane events for the resource group from the last hour
az monitor activity-log list \
--resource-group rg-claude-foundry \
--offset 1h \
--query "[].{time:eventTimestamp, op:operationName.localizedValue}" \
--output table
You should see entries for management operations such as creating the resource and the model deployment. The activity log does not record individual inference calls; to audit those, enable diagnostic settings on the Foundry resource so request logs and metrics flow to Log Analytics. In a production setup, these feed into your SIEM.
Checkpoint: You now have a Claude agent running through your Azure Foundry endpoint, backed by a real MCP tool, with audit logs flowing to Azure Monitor. The same pattern scales to any MCP server: databases, APIs, calendars, code runners.
Where to go next
- Microsoft Azure AI Foundry documentation: the official reference for Hub/Project setup, content filters, and private endpoint configuration.
- Model Context Protocol: understand how MCP works under the hood before you build your own server.
- Building Effective Agents: the architectural patterns (orchestrators, subagents, tool loops) that apply whether you're on Foundry, AWS Bedrock, or direct API.
- Claude on AWS: the equivalent workshop for Amazon Bedrock, useful if your organization uses AWS instead of Azure.