AI with Claude on AWS: From Code to Orchestration
Stand up Claude Code on Amazon Bedrock, teach it your team's conventions with CLAUDE.md and Agent Skills, then graduate to full multi-step orchestration with Lambda and Step Functions.
This lesson is original educational writing based on this video by Anthropic (published May 20, 2026). All credit for the original content goes to the creators.
1. Why Bedrock? The enterprise case for routing Claude through AWS
When you call claude from your terminal, requests go to Anthropic’s API by default. For personal or small-team use that is fine. For enterprise or regulated environments it often is not, for three reasons:
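Data residency and security. Bedrock traffic stays on AWS infrastructure, and with a VPC interface endpoint (AWS PrivateLink) it never has to cross the public internet. Prompts and responses stay within the AWS boundary you already govern, which satisfies many data-protection requirements without any custom proxy work.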
IAM governance. Every Bedrock call is an IAM action. That means you get per-user CloudTrail audit logs, service-control policies, and cost allocation by team — the same controls you already apply to every other AWS service.
Unified billing. One AWS invoice. No separate Anthropic subscription to manage, no token quotas siloed from your cloud budget.
The trade-off is some added latency compared with calling the Anthropic API directly (a simple Sonnet query typically takes 2–5 s end to end) and the overhead of managing AWS credentials. For most enterprise scenarios, that is an easy trade.
2. Switching Claude Code to Bedrock
Claude Code speaks to Bedrock through two environment variables. No code changes, no SDKs to swap.
export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=us-east-1
Or persist them in ~/.claude/settings.json so every session picks them up automatically:
{
  "awsAuthRefresh": "aws login",
  "env": {
    "AWS_REGION": "us-east-1",
    "CLAUDE_CODE_USE_BEDROCK": "1"
  }
}
awsAuthRefresh tells Claude Code to re-run aws login whenever credentials expire — useful with IAM Identity Center (SSO) sessions that rotate every few hours.
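If a settings file already exists, the Bedrock keys should be merged in rather than overwriting it. The sketch below shows one way to do that; `merge_bedrock_settings` and `update_settings_file` are hypothetical helper names, and the keys are the ones from the example above.

```python
import json
from pathlib import Path

# Keys and values taken from the settings.json example above.
BEDROCK_ENV = {"CLAUDE_CODE_USE_BEDROCK": "1", "AWS_REGION": "us-east-1"}


def merge_bedrock_settings(existing: dict, refresh_cmd: str = "aws login") -> dict:
    """Return a copy of `existing` with the Bedrock keys added; other keys are kept."""
    merged = dict(existing)
    merged["awsAuthRefresh"] = refresh_cmd
    merged["env"] = {**existing.get("env", {}), **BEDROCK_ENV}
    return merged


def update_settings_file(path: Path) -> dict:
    """Hypothetical helper: read, merge, and write back a settings file."""
    current = json.loads(path.read_text()) if path.exists() else {}
    merged = merge_bedrock_settings(current)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2) + "\n")
    return merged


if __name__ == "__main__":
    # Dry run on an in-memory dict; nothing is written to disk here.
    print(json.dumps(merge_bedrock_settings({"env": {"EDITOR": "vim"}}), indent=2))
```

Calling `update_settings_file(Path.home() / ".claude" / "settings.json")` would apply the merge to the real file; try it on a copy first.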
Authentication options
| Method | When to use |
|---|---|
| aws login (AWS CLI v2.32+) | Local development, interactive |
| aws sso login --profile <name> | IAM Identity Center / federated identity |
| Instance profile / ECS task role | Lambda functions, ECS containers, EC2 |
| AWS_BEARER_TOKEN_BEDROCK env var | Quick tests; no per-user CloudTrail audit |
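Whichever IAM-based method you pick, it helps to confirm which principal your credentials resolve to before debugging anything else. A minimal sketch using the STS `get_caller_identity` call follows; `describe_caller` and `FakeSts` are illustrative names, and the fake exists only so the example runs without AWS credentials.

```python
def describe_caller(sts_client) -> str:
    """Return 'account / arn' for whatever credentials the client resolves."""
    identity = sts_client.get_caller_identity()
    return f"{identity['Account']} / {identity['Arn']}"


class FakeSts:
    """Stand-in for boto3.client('sts') so the sketch runs offline."""

    def get_caller_identity(self):
        return {
            "UserId": "AIDEXAMPLE",
            "Account": "123456789012",
            "Arn": "arn:aws:sts::123456789012:assumed-role/dev/alice",
        }


if __name__ == "__main__":
    # Real use: import boto3; print(describe_caller(boto3.client("sts")))
    print(describe_caller(FakeSts()))
```

The ARN returned here is the principal that will appear in CloudTrail for Bedrock calls made with the same credentials.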
IAM permissions required
Your IAM principal needs at minimum:
{
  "Effect": "Allow",
  "Action": [
    "bedrock:InvokeModel",
    "bedrock:InvokeModelWithResponseStream"
  ],
  "Resource": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-*"
}
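InvokeModelWithResponseStream is required for streaming responses; without it, Claude Code can hang with no visible error. If you use cross-Region inference profile IDs (the us.-prefixed model IDs below), the Resource list must also cover the inference-profile ARN and the foundation-model ARNs in each Region the profile routes to.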
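Because the missing streaming permission fails quietly, a quick check of the policy document can save debugging time. The sketch below is a deliberately simplified checker, not a full IAM evaluator: `missing_bedrock_actions` is a hypothetical helper that looks only at Allow statements and ignores Resource, Condition, and Deny.

```python
REQUIRED_ACTIONS = {"bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"}


def missing_bedrock_actions(policy: dict) -> set:
    """Return the required actions that no Allow statement in `policy` grants.

    Simplification: only exact action names and the wildcards 'bedrock:*'
    and '*' are recognised; Resource, Condition, and Deny are ignored.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # IAM also accepts a single statement object
        statements = [statements]
    granted = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        granted.update(actions)
    if "*" in granted or "bedrock:*" in granted:
        return set()
    return REQUIRED_ACTIONS - granted


if __name__ == "__main__":
    incomplete = {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": ["bedrock:InvokeModel"], "Resource": "*"}
        ],
    }
    print(missing_bedrock_actions(incomplete))
```

For an authoritative answer, the IAM policy simulator evaluates the full policy set; this sketch only catches the common omission.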
Choosing a model tier
Use /model inside Claude Code to switch on the fly. For programmatic defaults:
{
  "env": {
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL": "us.anthropic.claude-haiku-4-5-20251001-v1:0"
  }
}
Haiku is several times cheaper than Sonnet per token (about 3× at list prices for the 4.5-generation models configured above; check current Bedrock pricing) and handles routine edits, linting, and short Q&A well. Sonnet is the default sweet spot. Opus is best reserved for architecture decisions and complex multi-file refactors. A dual-model strategy (Haiku for background tasks, Sonnet for interactive work) can cut token costs by 40–60% in practice.
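How much a dual-model strategy saves depends on the price ratio and on how much traffic actually moves to the cheaper tier. The arithmetic can be sketched as below; the function names are illustrative and the prices are placeholders, not quoted Bedrock rates.

```python
def blended_cost(tokens_millions, haiku_share, sonnet_price, haiku_price):
    """Cost of a workload when `haiku_share` of its tokens go to Haiku."""
    if not 0.0 <= haiku_share <= 1.0:
        raise ValueError("haiku_share must be between 0 and 1")
    haiku_cost = tokens_millions * haiku_share * haiku_price
    sonnet_cost = tokens_millions * (1.0 - haiku_share) * sonnet_price
    return haiku_cost + sonnet_cost


def savings_fraction(haiku_share, sonnet_price, haiku_price):
    """Fraction saved versus sending every token to Sonnet."""
    baseline = blended_cost(1.0, 0.0, sonnet_price, haiku_price)
    mixed = blended_cost(1.0, haiku_share, sonnet_price, haiku_price)
    return 1.0 - mixed / baseline


if __name__ == "__main__":
    # Placeholder prices per million tokens; substitute current Bedrock pricing.
    print(f"{savings_fraction(0.6, sonnet_price=3.0, haiku_price=1.0):.0%}")
```

With those placeholder prices (3 and 1 per million tokens), moving 60% of tokens to Haiku saves 40%. Real workloads also differ in input/output mix, which this sketch ignores.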
3. Team conventions: CLAUDE.md and Agent Skills
Switching to Bedrock gives you infrastructure. The next step is encoding your team’s knowledge so every developer benefits from the same institutional context.
CLAUDE.md at scale
Claude Code reads CLAUDE.md from the project root (and parent directories up to ~). For a team on Bedrock, a shared project CLAUDE.md becomes the single source of truth for how Claude should behave on this codebase:
# CLAUDE.md
## AWS Environment
- Region: us-east-1 (primary), us-west-2 (DR)
- Account IDs: prod=123456789012 / staging=234567890123
- Deploy via CDK: `cdk deploy --all`
## Conventions
- Lambda functions: Python 3.12, arm64, 512 MB default
- IaC: CDK TypeScript (no CloudFormation raw templates)
- Secrets: Secrets Manager only — never env vars in Lambda config
- Naming: `{team}-{service}-{env}` (e.g. platform-auth-prod)
## Must-not-touch
- Do not modify the shared VPC stack (vpc-stack.ts) without infra-team review
Commit this file. Every developer who clones the repo gets the same Claude context automatically, without any personal setup.
Agent Skills
Agent Skills are modular, reusable capability packages that live in .claude/skills/ (or a shared package your team installs). Each Skill is a folder containing a SKILL.md file plus any supporting scripts or reference material. Where CLAUDE.md holds project context, Skills hold domain expertise: deep knowledge of a particular technology or workflow, packaged for reuse across projects.
A typical AWS team might maintain Skills for:
- CDK patterns — preferred constructs, tagging standards, stack naming
- Bedrock AgentCore — how to register, version and invoke agents
- Serverless cost ops — how to profile Lambda cold starts, set memory correctly, use Graviton
Skills compose with CLAUDE.md: CLAUDE.md says what the project is, Skills provide expertise on how to build things correctly within it. Together they replace the implicit knowledge that usually lives in a senior engineer’s head.
4. Calling Claude from Lambda: the two APIs
Once Claude Code is handling developer workflows, the next step is putting Claude into production AWS workloads. Lambda is the most common entry point.
The Converse API (recommended)
The Bedrock Converse API is model-agnostic: the same code works with Claude, Titan, Llama, and any other Bedrock model. Use it for new code.
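import json

import boto3

client = boto3.client('bedrock-runtime', region_name='us-east-1')

# Placeholder input; in a Lambda this would come from the event payload.
ticket_text = 'Customer cannot log in after resetting their password.'

response = client.converse(
    # Inference-profile ID, matching the model IDs used elsewhere in this lesson.
    modelId='us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    messages=[{
        'role': 'user',
        'content': [{'text': 'Summarize this support ticket: ' + ticket_text}]
    }],
    inferenceConfig={'maxTokens': 512, 'temperature': 0.1}
)
# The reply is a list of content blocks; the first one holds the text here.
answer = response['output']['message']['content'][0]['text']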
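Wrapped in a Lambda handler, the same call might look like the sketch below. The event key `ticket_text`, the helper `summarize_ticket`, and the `FakeBedrock` stub are assumptions for illustration; the model ID is the one used in this lesson and should be verified for your Region.

```python
MODEL_ID = "us.anthropic.claude-sonnet-4-5-20250929-v1:0"  # from this lesson; verify in your Region


def summarize_ticket(client, ticket_text: str) -> str:
    """Send one Converse request and return the concatenated text of the reply."""
    response = client.converse(
        modelId=MODEL_ID,
        messages=[{
            "role": "user",
            "content": [{"text": "Summarize this support ticket: " + ticket_text}],
        }],
        inferenceConfig={"maxTokens": 512, "temperature": 0.1},
    )
    blocks = response["output"]["message"]["content"]
    # Join every text block instead of assuming the first block is text.
    return "".join(block["text"] for block in blocks if "text" in block)


_client = None  # created lazily and reused across warm invocations


def lambda_handler(event, context):
    """Hypothetical handler; expects {'ticket_text': '...'} in the event."""
    global _client
    if _client is None:
        import boto3  # imported here so the module also loads where boto3 is absent
        _client = boto3.client("bedrock-runtime")
    return {"summary": summarize_ticket(_client, event["ticket_text"])}


class FakeBedrock:
    """Offline stub that mimics the shape of a Converse response."""

    def converse(self, **kwargs):
        return {"output": {"message": {"content": [{"text": "Login "}, {"text": "issue."}]}}}


if __name__ == "__main__":
    print(summarize_ticket(FakeBedrock(), "Customer cannot log in."))
```

Creating the client outside the per-request path avoids rebuilding it on every warm invocation, and passing it into `summarize_ticket` keeps the logic testable without AWS access.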
The InvokeModel API (Claude-specific)
invoke_model gives you direct access to Claude’s native request format. Use it when you need features that haven’t landed in Converse yet (extended thinking, certain tool-use patterns).
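response = client.invoke_model(
    modelId='us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    body=json.dumps({
        'anthropic_version': 'bedrock-2023-05-31',
        'messages': [{'role': 'user', 'content': ticket_text}],
        # max_tokens must exceed the thinking budget, since thinking tokens count toward it.
        'max_tokens': 4096,
        'thinking': {'type': 'enabled', 'budget_tokens': 2000}
    })
)
# With thinking enabled, result['content'] holds thinking blocks before the text block.
result = json.loads(response['body'].read())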
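When extended thinking is enabled, the parsed body contains a list of typed content blocks rather than a single string, so the answer has to be picked out by block type. A minimal sketch, assuming the native response shape of `thinking` and `text` blocks; `split_claude_blocks` and the sample payload are illustrative.

```python
def split_claude_blocks(result: dict) -> tuple:
    """Return (thinking_text, answer_text) from a parsed native Claude response."""
    thinking, answer = [], []
    for block in result.get("content", []):
        if block.get("type") == "thinking":
            thinking.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            answer.append(block.get("text", ""))
    return "\n".join(thinking), "".join(answer)


# Hand-written sample in the shape of a parsed response body (not real model output).
sample = {
    "content": [
        {"type": "thinking", "thinking": "Customer reports a login loop.", "signature": "..."},
        {"type": "text", "text": "User cannot log in after a password reset."},
    ]
}

if __name__ == "__main__":
    reasoning, summary = split_claude_blocks(sample)
    print(summary)
```

In most applications only the text part is shown to users; the thinking part is mainly useful for debugging.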
Lambda configuration for Bedrock calls
Key settings for a production Lambda that calls Bedrock:
| Setting | Recommended value | Reason |
|---|---|---|
| Runtime | Python 3.12, arm64 | Cost, performance |
| Memory | 512 MB (tune up if slow) | Bedrock calls are I/O-bound, not CPU-bound |
| Timeout | 60–300 s | Long reasoning chains can take 30+ s |
| Execution role | bedrock:InvokeModel + bedrock:InvokeModelWithResponseStream | Minimum required |
| Retry policy | Exponential backoff, max 3 | Bedrock throttles at burst limits |
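The retry row in the table can be sketched as a small wrapper. This is one possible shape under stated assumptions: `call_with_backoff` and `is_throttle` are illustrative names, and the error codes checked are the ones Bedrock commonly returns for throttling and transient unavailability.

```python
import random
import time


def call_with_backoff(fn, is_retryable, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call fn(); on a retryable error wait base_delay * 2**attempt (plus jitter) and retry."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            last_try = attempt == max_attempts - 1
            if last_try or not is_retryable(exc):
                raise
            sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))


def is_throttle(exc) -> bool:
    """botocore's ClientError carries the error code under response['Error']['Code']."""
    code = getattr(exc, "response", {}).get("Error", {}).get("Code", "")
    return code in {"ThrottlingException", "ServiceUnavailableException"}
```

A call would then look like `call_with_backoff(lambda: client.converse(...), is_throttle)`. boto3 can also retry internally if the client is built with `botocore.config.Config(retries={"max_attempts": 3, "mode": "adaptive"})`, which may make a hand-written loop unnecessary.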
Check your understanding
1. Which IAM permission is commonly missed and causes Claude Code to silently freeze when connected to Bedrock?
2. What is the primary advantage of the Bedrock Converse API over InvokeModel?
5. Orchestration with Step Functions
A single Lambda call handles stateless, single-turn Claude interactions well. When you need multi-step workflows — document ingestion, human review gates, parallel analysis branches, retry on failure — AWS Step Functions is the right tool.
Why Step Functions beats custom retry logic in Lambda
| Concern | DIY in Lambda | Step Functions |
|---|---|---|
| Retry on throttle | Custom sleep loop | Built-in Retry with exponential backoff |
| Parallel tasks | Thread pools, complexity | Parallel state, managed |
| Human approval gate | Polling loop or SQS | Wait for task token built-in |
| Audit trail | CloudWatch logs | Full execution history in console |
| Timeout per step | One Lambda timeout | Per-state timeouts |
A minimal document-analysis state machine
{
  "Comment": "Analyse a document with Claude via Bedrock",
  "StartAt": "FetchDocument",
  "States": {
    "FetchDocument": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:fetch-doc",
      "Next": "AnalyseWithClaude"
    },
    "AnalyseWithClaude": {
      "Type": "Task",
      "Resource": "arn:aws:states:::bedrock:invokeModel",
      "Parameters": {
        "ModelId": "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "Body": {
          "anthropic_version": "bedrock-2023-05-31",
          "messages": [{"role": "user", "content.$": "States.Format('Analyse this document: {}', $.documentText)"}],
          "max_tokens": 1024
        }
      },
      "Retry": [{"ErrorEquals": ["States.TaskFailed"], "IntervalSeconds": 5, "MaxAttempts": 3, "BackoffRate": 2}],
      "Next": "StoreResult"
    },
    "StoreResult": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:store-result",
      "End": true
    }
  }
}
Step Functions has a native arn:aws:states:::bedrock:invokeModel integration — you can call Claude directly from a state definition without writing a Lambda wrapper for that step.
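If you generate state machine definitions from code rather than writing JSON by hand, the Bedrock task state can be produced by a small builder. The sketch below mirrors the AnalyseWithClaude state from the example; `bedrock_task_state` is a hypothetical helper, and the retry values are copied from that example rather than tuned.

```python
import json


def bedrock_task_state(model_id, prompt_template, input_path, next_state=None):
    """Build a Task state that calls Bedrock through the native integration."""
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::bedrock:invokeModel",
        "Parameters": {
            "ModelId": model_id,
            "Body": {
                "anthropic_version": "bedrock-2023-05-31",
                "messages": [{
                    "role": "user",
                    # The '.$' suffix tells Step Functions to evaluate the value as an expression.
                    "content.$": f"States.Format('{prompt_template}', {input_path})",
                }],
                "max_tokens": 1024,
            },
        },
        "Retry": [{
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 5,
            "MaxAttempts": 3,
            "BackoffRate": 2,
        }],
    }
    # A state either transitions to another state or ends the execution.
    if next_state:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


if __name__ == "__main__":
    state = bedrock_task_state(
        "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "Analyse this document: {}",
        "$.documentText",
        next_state="StoreResult",
    )
    print(json.dumps(state, indent=2))
```

Prompt templates containing single quotes would need escaping before being embedded in `States.Format`; the sketch does not handle that case.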
When to choose what
- Single Lambda — one-turn response, user-facing latency matters, stateless
- Lambda + Step Functions — multi-step pipeline, retries, parallel branches, human gates
- Bedrock Agents — open-ended tool-use where the model itself decides which steps to run and in what order
Check your understanding
1. You need Claude to analyse 200 documents in parallel and fan the results into a single summary. Which AWS pattern fits best?
2. Step Functions has a native integration resource for Bedrock (arn:aws:states:::bedrock:invokeModel). What does this mean in practice?
3. Which component is most appropriate when you need Claude to autonomously decide which tools to call and in what order across an open-ended task?
6. Cost control and production readiness
Running Claude at scale on Bedrock requires attention to a few cost and reliability levers.
Prompt caching
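Bedrock supports prompt caching for Claude models, and Claude Code uses it automatically. Repeated prefixes (CLAUDE.md content, system prompts, tool definitions) can cut input token costs by up to 90% when the same prefix is reused before the cache entry expires. In your own Bedrock API calls, caching is opt-in: you mark the end of the reusable prefix with a cache checkpoint.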
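In your own Converse calls, a cache checkpoint is placed after the part of the prompt you expect to reuse. The sketch below builds such a request; `build_cached_request` is an illustrative helper, and whether a given prefix is actually cached depends on model support and minimum prefix length, which you should confirm in the Bedrock documentation.

```python
def build_cached_request(model_id: str, shared_prefix: str, question: str) -> dict:
    """Build Converse keyword arguments with a cache checkpoint after the system prompt."""
    return {
        "modelId": model_id,
        "system": [
            {"text": shared_prefix},
            # Everything before this marker is eligible for caching.
            {"cachePoint": {"type": "default"}},
        ],
        "messages": [{"role": "user", "content": [{"text": question}]}],
        "inferenceConfig": {"maxTokens": 512},
    }


if __name__ == "__main__":
    kwargs = build_cached_request(
        "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "You are a support assistant. Follow the team style guide.",  # placeholder prefix
        "Summarize ticket 1.",
    )
    # Real use: response = boto3.client("bedrock-runtime").converse(**kwargs)
    print(list(kwargs))
```

Keeping the shared prefix byte-for-byte identical across calls matters, because any change before the checkpoint produces a cache miss. The `usage` field of the response reports cache reads and writes, which is the simplest way to confirm the cache is being hit.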
.claudeignore
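Just like .gitignore, a .claudeignore file lists paths Claude Code should skip when exploring a project. Excluding node_modules/, dist/, *.log, and large data files keeps context windows focused and tokens cheap. Support for this file varies by Claude Code version, so confirm that yours honors it; permissions.deny rules in settings.json are the documented way to block reads of specific paths.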
# .claudeignore
node_modules/
dist/
cdk.out/
*.log
*.parquet
__pycache__/
CloudWatch alarms for token spend
Bedrock publishes usage metrics to CloudWatch in the AWS/Bedrock namespace, including InvocationLatency, InputTokenCount, and OutputTokenCount. Set alarms on latency and token counts, and add a daily AWS Budgets alert on Bedrock spend, so runaway agent loops don’t generate surprise bills.
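A token-count alarm can be created with CloudWatch's `put_metric_alarm`. The sketch below only builds the arguments so it runs offline; `token_alarm_kwargs` is a hypothetical helper, the threshold and SNS topic are placeholders, and the `ModelId` dimension value should be checked against what your CloudWatch console shows.

```python
def token_alarm_kwargs(model_id: str, hourly_output_token_limit: int, sns_topic_arn: str) -> dict:
    """Build put_metric_alarm arguments for an hourly output-token ceiling."""
    safe_name = model_id.replace(":", "-").replace(".", "-")
    return {
        "AlarmName": f"bedrock-output-tokens-{safe_name}",
        "Namespace": "AWS/Bedrock",
        "MetricName": "OutputTokenCount",
        # Assumption: the dimension value matches the model ID used in your calls.
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Statistic": "Sum",
        "Period": 3600,  # seconds; one evaluation window per hour
        "EvaluationPeriods": 1,
        "Threshold": float(hourly_output_token_limit),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no traffic should not trigger the alarm
        "AlarmActions": [sns_topic_arn],
    }


if __name__ == "__main__":
    kwargs = token_alarm_kwargs(
        "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        500_000,  # placeholder limit
        "arn:aws:sns:us-east-1:123456789012:bedrock-alerts",  # placeholder topic
    )
    # Real use: boto3.client("cloudwatch").put_metric_alarm(**kwargs)
    print(kwargs["AlarmName"])
```

An hourly window catches a runaway loop faster than a daily budget alert, at the cost of occasional false alarms during legitimate bursts.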
Dual-model strategy
Assign model tiers to task types in your team CLAUDE.md:
## Model selection policy
- Routine edits, lint fixes, docs: use Haiku (cheapest)
- Interactive coding sessions: use Sonnet (default)
- Architecture review, complex refactors: use Opus
Teams that enforce this consciously typically cut token costs 40–60% compared to always using Sonnet.
Build it yourself
Follow these exact steps to reproduce it yourself · estimated time: ~25 min
Prerequisites
- AWS CLI v2.32+ installed and configured (`aws --version`)
- Claude Code installed (`claude --version`)
- An IAM user or role with `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` permissions
- Amazon Bedrock enabled in your chosen region (us-east-1 or us-west-2 recommended)
- A code project to experiment in
Step 1 — Enable model access in Bedrock
Open the Amazon Bedrock console, go to Model access, and request access to the Claude models you plan to use (Sonnet is a good starting point). Access is usually granted within a few minutes.
Step 2 — Configure Claude Code for Bedrock
Create or update ~/.claude/settings.json:
{
  "awsAuthRefresh": "aws sso login --profile your-profile-name",
  "env": {
    "AWS_REGION": "us-east-1",
    "CLAUDE_CODE_USE_BEDROCK": "1"
  }
}
Replace awsAuthRefresh with the command you use to refresh credentials (aws login for the new auth flow, or omit if you use long-lived access keys).
Verify it works:
claude --version
# Start a session and run a quick test
claude
# Inside the session:
# What AWS region am I configured to use?
Step 3 — Bootstrap project memory
Navigate to your code project and initialise CLAUDE.md:
cd your-project
claude
Inside the session:
/init
Review the generated CLAUDE.md, then add AWS-specific context:
## AWS Environment
- Region: us-east-1
- Deploy: cdk deploy (CDK TypeScript)
- Lambda runtime: python3.12, arm64
- Secrets: Secrets Manager only
Commit the file so all teammates get the same context.
Step 4 — Add a .claudeignore
cat > .claudeignore << 'EOF'
node_modules/
dist/
cdk.out/
*.log
__pycache__/
.venv/
EOF
Step 5 — Call Claude from a Lambda (test locally first)
Create test_bedrock.py:
import boto3, json
client = boto3.client('bedrock-runtime', region_name='us-east-1')
response = client.converse(
    modelId='us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    messages=[{
        'role': 'user',
        'content': [{'text': 'Say hello and confirm you are running on Amazon Bedrock.'}]
    }],
    inferenceConfig={'maxTokens': 128}
)
print(response['output']['message']['content'][0]['text'])

python3 test_bedrock.py

Expected output: a short greeting confirming Bedrock access. If you see AccessDeniedException, check that both bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream are in your IAM policy.
Step 6 — Create a minimal Step Functions workflow
Save workflow.json:
{
  "Comment": "Hello Bedrock from Step Functions",
  "StartAt": "AskClaude",
  "States": {
    "AskClaude": {
      "Type": "Task",
      "Resource": "arn:aws:states:::bedrock:invokeModel",
      "Parameters": {
        "ModelId": "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "Body": {
          "anthropic_version": "bedrock-2023-05-31",
          "messages": [{"role": "user", "content": "What is the capital of France? Answer in one word."}],
          "max_tokens": 16
        }
      },
      "End": true
    }
  }
}
Create and run the state machine:
aws stepfunctions create-state-machine \
  --name hello-bedrock \
  --definition file://workflow.json \
  --role-arn arn:aws:iam::YOUR_ACCOUNT:role/StepFunctionsBedrockRole \
  --region us-east-1
aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:us-east-1:YOUR_ACCOUNT:stateMachine:hello-bedrock \
  --region us-east-1
Check the execution result in the Step Functions console. You should see Claude’s response in the output of the AskClaude state.
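The same run can be driven from Python instead of the console. The sketch below starts an execution and polls until it finishes, using the Step Functions `start_execution` and `describe_execution` calls; `run_and_wait`, the polling limits, and the `FakeSfn` stub are illustrative assumptions.

```python
import json
import time


def run_and_wait(sfn_client, state_machine_arn, payload, poll_seconds=2.0,
                 max_polls=150, sleep=time.sleep):
    """Start an execution and poll until it leaves the RUNNING state."""
    started = sfn_client.start_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),
    )
    execution_arn = started["executionArn"]
    for _ in range(max_polls):
        desc = sfn_client.describe_execution(executionArn=execution_arn)
        if desc["status"] == "SUCCEEDED":
            return json.loads(desc["output"])
        if desc["status"] != "RUNNING":
            raise RuntimeError(f"execution ended with status {desc['status']}")
        sleep(poll_seconds)
    raise TimeoutError("execution still running after the polling limit")


class FakeSfn:
    """Offline stub: reports RUNNING twice, then SUCCEEDED."""

    def __init__(self):
        self.polls = 0

    def start_execution(self, **kwargs):
        return {"executionArn": "arn:aws:states:us-east-1:123456789012:execution:demo:1"}

    def describe_execution(self, **kwargs):
        self.polls += 1
        if self.polls < 3:
            return {"status": "RUNNING"}
        return {"status": "SUCCEEDED", "output": json.dumps({"answer": "Paris"})}


if __name__ == "__main__":
    # Real use: import boto3; run_and_wait(boto3.client("stepfunctions"), ARN, {})
    print(run_and_wait(FakeSfn(), "arn:placeholder", {}, sleep=lambda s: None))
```

Polling is fine for a quick test; for production pipelines, an EventBridge rule on execution status changes avoids holding a process open.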
Expected result: a working end-to-end path from developer laptop → Bedrock-powered Claude Code → production Lambda/Step Functions workflow, all governed by IAM and billed to a single AWS account. If any step fails, check IAM permissions first — they are the most common blocker at each layer.
Where to go next
- Watch the session recording from Code w/ Claude 2026 (London) — Antonio Rodriguez walks through a live Bedrock setup.
- Read the Claude Code on Amazon Bedrock quick setup guide on AWS Builder Center for the latest model IDs and auth options.
- Explore the aws-skills GitHub repo for community-built Agent Skills covering CDK, serverless, and Bedrock AgentCore.
- Continue with Mastering Claude Code to go deeper on CLAUDE.md, slash commands, and headless automation.