AI Learning
intermediate ⏱️ 14 min read · 🎬 ~19 min video

AI with Claude on AWS: From Code to Orchestration

Stand up Claude Code on Amazon Bedrock, teach it your team's conventions with CLAUDE.md and Agent Skills, then graduate to full multi-step orchestration with Lambda and Step Functions.

This lesson is original educational writing based on this video by Anthropic (published May 20, 2026). All credit for the original content goes to the creators.

#agents #claude-code #productivity

1. Why Bedrock? The enterprise case for routing Claude through AWS

When you call claude from your terminal, requests go to Anthropic’s API by default. For personal or small-team use that is fine. For enterprise or regulated environments it often is not, for three reasons:

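Data residency and security. With a VPC interface endpoint (AWS PrivateLink), Bedrock inference traffic stays on the AWS network instead of crossing the public internet, and requests are processed within the AWS Regions or geography you choose. That covers many data-protection requirements without any custom proxy work.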

IAM governance. Every Bedrock call is an IAM action. That means you get per-user CloudTrail audit logs, service-control policies, and cost allocation by team — the same controls you already apply to every other AWS service.

Unified billing. One AWS invoice. No separate Anthropic subscription to manage, no token quotas siloed from your cloud budget.

The trade-off is some extra latency compared with calling the Anthropic API directly (a simple Sonnet query typically takes 2–5 s end to end) and the overhead of managing AWS credentials. For most enterprise scenarios, that is an easy trade.

2. Switching Claude Code to Bedrock

Claude Code speaks to Bedrock through two environment variables. No code changes, no SDKs to swap.

export CLAUDE_CODE_USE_BEDROCK=1
export AWS_REGION=us-east-1

Or persist them in ~/.claude/settings.json so every session picks them up automatically:

{
  "awsAuthRefresh": "aws login",
  "env": {
    "AWS_REGION": "us-east-1",
    "CLAUDE_CODE_USE_BEDROCK": "1"
  }
}

awsAuthRefresh tells Claude Code to re-run aws login whenever credentials expire — useful with IAM Identity Center (SSO) sessions that rotate every few hours.
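
If you provision developer machines with a script, the same keys can be merged into an existing settings file rather than overwriting it. The following is a minimal Python sketch under that assumption; the helper names merge_bedrock_settings and update_settings_file are hypothetical, and the keys are the ones shown above.

```python
import json
from pathlib import Path


def merge_bedrock_settings(existing: dict, region: str = "us-east-1",
                           refresh_cmd: str = "aws login") -> dict:
    """Return a copy of `existing` with the Bedrock keys from this section set."""
    merged = dict(existing)
    env = dict(merged.get("env", {}))  # keep env vars that are already present
    env["CLAUDE_CODE_USE_BEDROCK"] = "1"
    env["AWS_REGION"] = region
    merged["env"] = env
    merged["awsAuthRefresh"] = refresh_cmd
    return merged


def update_settings_file(path: Path) -> None:
    """Read the settings file if it exists, merge the Bedrock keys, write it back."""
    current = json.loads(path.read_text()) if path.exists() else {}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merge_bedrock_settings(current), indent=2) + "\n")


# Usage (writes to your home directory, so it is left as a comment):
# update_settings_file(Path.home() / ".claude" / "settings.json")
```

Keeping the merge separate from the file I/O makes it easy to check the result before anything is written to disk.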

Authentication options

| Method | When to use |
| --- | --- |
| aws login (AWS CLI v2.32+) | Local development, interactive |
| aws sso login --profile <name> | IAM Identity Center / federated identity |
| Instance profile / ECS task role | Lambda functions, ECS containers, EC2 |
| AWS_BEARER_TOKEN_BEDROCK env var | Quick tests; no per-user CloudTrail audit |

IAM permissions required

Your IAM principal needs at minimum:

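{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"],
      "Resource": [
        "arn:aws:bedrock:*::foundation-model/anthropic.claude-*",
        "arn:aws:bedrock:*:*:inference-profile/*"
      ]
    }
  ]
}

InvokeModelWithResponseStream is required for streaming responses; Claude Code will silently freeze without it. The inference-profile resource and the region wildcard cover cross-region model IDs with a us. prefix, like the ones used later in this lesson. Narrow both to match your own regions and models.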
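
Because a missing streaming permission fails quietly, it can help to lint a policy document before attaching it. Here is a minimal Python sketch of such a check. The function name missing_bedrock_actions is hypothetical, and the check is deliberately simplified: it looks only at Allow statements and exact action names, and ignores resources, conditions, and Deny statements.

```python
REQUIRED_ACTIONS = {"bedrock:InvokeModel", "bedrock:InvokeModelWithResponseStream"}


def missing_bedrock_actions(policy: dict) -> set:
    """Return the required actions that no Allow statement in `policy` grants.

    Simplification: only exact action names and the wildcards "*" and
    "bedrock:*" are recognised.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be a bare object
        statements = [statements]
    granted = set()
    for stmt in statements:
        if stmt.get("Effect") != "Allow":
            continue
        actions = stmt.get("Action", [])
        if isinstance(actions, str):  # Action may be a string or a list
            actions = [actions]
        for action in actions:
            if action in ("*", "bedrock:*"):
                return set()  # a wildcard covers both required actions
            granted.add(action)
    return REQUIRED_ACTIONS - granted
```

An empty result means both actions are present. It does not prove the policy is effective, since resource scoping and service-control policies can still block the call.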

Choosing a model tier

Use /model inside Claude Code to switch on the fly. For programmatic defaults:

{
  "env": {
    "ANTHROPIC_DEFAULT_SONNET_MODEL": "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    "ANTHROPIC_DEFAULT_HAIKU_MODEL":  "us.anthropic.claude-haiku-4-5-20251001-v1:0"
  }
}

Haiku costs several times less per token than Sonnet (roughly 3× at list prices for the 4.5 models named above; check current Bedrock pricing) and handles routine edits, linting, and short Q&A well. Sonnet is the default sweet spot. Reserve Opus for architecture decisions and complex multi-file refactors. A dual-model strategy (Haiku for background tasks, Sonnet for interactive work) can cut token costs by 40–60% in practice.
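
To see how a savings figure like that arises, the arithmetic can be sketched in a few lines of Python. Both inputs are assumptions you should replace with your own numbers: the share of tokens routed to Haiku, and Haiku's per-token price relative to Sonnet (taken from current Bedrock pricing). The sketch also assumes the same price ratio applies to input and output tokens.

```python
def blended_cost_ratio(haiku_share: float, haiku_price_ratio: float) -> float:
    """Cost of a Haiku/Sonnet mix relative to sending everything to Sonnet.

    haiku_share: fraction of tokens routed to Haiku (0.0 to 1.0).
    haiku_price_ratio: Haiku's per-token price divided by Sonnet's.
    """
    if not 0.0 <= haiku_share <= 1.0:
        raise ValueError("haiku_share must be between 0 and 1")
    return haiku_share * haiku_price_ratio + (1.0 - haiku_share)


def savings_percent(haiku_share: float, haiku_price_ratio: float) -> float:
    """Percentage saved versus an all-Sonnet baseline, rounded to one decimal."""
    return round((1.0 - blended_cost_ratio(haiku_share, haiku_price_ratio)) * 100, 1)


# Illustrative only: 60% of tokens on Haiku at one third of Sonnet's price.
print(savings_percent(0.6, 1 / 3))  # 40.0
```

The savings scale with the share of traffic you can move, so the policy matters more than the exact price gap.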

3. Team conventions: CLAUDE.md and Agent Skills

Switching to Bedrock gives you infrastructure. The next step is encoding your team’s knowledge so every developer benefits from the same institutional context.

CLAUDE.md at scale

Claude Code reads CLAUDE.md from the project root (and parent directories up to ~). For a team on Bedrock, a shared project CLAUDE.md becomes the single source of truth for how Claude should behave on this codebase:

# CLAUDE.md

## AWS Environment
- Region: us-east-1 (primary), us-west-2 (DR)
- Account IDs: prod=123456789012 / staging=234567890123
- Deploy via CDK: `cdk deploy --all`

## Conventions
- Lambda functions: Python 3.12, arm64, 512 MB default
- IaC: CDK TypeScript (no CloudFormation raw templates)
- Secrets: Secrets Manager only — never env vars in Lambda config
- Naming: `{team}-{service}-{env}` (e.g. platform-auth-prod)

## Must-not-touch
- Do not modify the shared VPC stack (vpc-stack.ts) without infra-team review

Commit this file. Every developer who clones the repo gets the same Claude context automatically, without any personal setup.
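
Conventions written in CLAUDE.md can also be enforced mechanically, for example in a CI step. The sketch below checks the {team}-{service}-{env} naming rule from the example file. It is a hypothetical helper, not part of Claude Code, and it assumes lowercase alphanumeric segments and only the two environments named in the example (prod and staging).

```python
import re
from typing import Optional

# Assumption: only the environments that appear in the example CLAUDE.md.
ALLOWED_ENVS = ("prod", "staging")

# team, then a service that may itself contain hyphens, then env.
_NAME_RE = re.compile(r"^([a-z0-9]+)-([a-z0-9]+(?:-[a-z0-9]+)*)-([a-z0-9]+)$")


def parse_resource_name(name: str) -> Optional[dict]:
    """Split a name into team/service/env, or return None if it breaks the rule."""
    match = _NAME_RE.match(name)
    if match is None:
        return None
    team, service, env = match.groups()
    if env not in ALLOWED_ENVS:
        return None
    return {"team": team, "service": service, "env": env}


print(parse_resource_name("platform-auth-prod"))
# {'team': 'platform', 'service': 'auth', 'env': 'prod'}
```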

Agent Skills

Agent Skills are modular, reusable capability files that live in .claude/skills/ (or a shared package your team installs). Where CLAUDE.md holds project context, Skills hold domain expertise — deep knowledge of a particular technology or workflow packaged for reuse across projects.

A typical AWS team might maintain Skills for:

  • CDK patterns — preferred constructs, tagging standards, stack naming
  • Bedrock AgentCore — how to register, version and invoke agents
  • Serverless cost ops — how to profile Lambda cold starts, set memory correctly, use Graviton

Skills compose with CLAUDE.md: CLAUDE.md says what the project is, Skills provide expertise on how to build things correctly within it. Together they replace the implicit knowledge that usually lives in a senior engineer’s head.

Figure (diagram): Developer (runs claude) → Claude Code (CLAUDE.md context, Agent Skills) → Amazon Bedrock (Claude models, IAM + CloudTrail), inside the AWS account (VPC · Cost Explorer · SCPs). Edge labels: prompt, HTTPS/TLS, governed by.
Caption: How CLAUDE.md (project context), Agent Skills (domain expertise), and Bedrock (infrastructure) layer to give Claude a team-aware, enterprise-governed workspace.

4. Calling Claude from Lambda: the two APIs

Once Claude Code is handling developer workflows, the next step is putting Claude into production AWS workloads. Lambda is the most common entry point.

The Bedrock Converse API is model-agnostic: the same code works with Claude, Titan, Llama, and any other Bedrock model. Use it for new code.

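import boto3, json  # json is used by the InvokeModel example that follows

# Create the client once so warm Lambda invocations can reuse it.
client = boto3.client('bedrock-runtime', region_name='us-east-1')

# Placeholder input; in a Lambda handler this would come from the event payload.
ticket_text = 'Customer reports login failures since the last deploy.'

response = client.converse(
    # Inference-profile ID (note the "us." prefix), as in the Build it yourself steps.
    modelId='us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    messages=[{
        'role': 'user',
        'content': [{'text': 'Summarize this support ticket: ' + ticket_text}]
    }],
    inferenceConfig={'maxTokens': 512, 'temperature': 0.1}
)

# The reply is a list of content blocks; the first one holds the text.
answer = response['output']['message']['content'][0]['text']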
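
Wrapped in a Lambda handler, the Converse call might look like the sketch below. The event shape (a ticket_text key), the response shape, and the optional client parameter used for local testing are all assumptions for illustration. boto3 is imported lazily so the module can be loaded and tested without AWS dependencies.

```python
import json

_client = None  # created once per execution environment, reused on warm starts


def _get_client():
    global _client
    if _client is None:
        import boto3  # lazy import keeps the module importable in unit tests
        _client = boto3.client("bedrock-runtime")
    return _client


def summarize_ticket(client, ticket_text: str,
                     model_id: str = "us.anthropic.claude-sonnet-4-5-20250929-v1:0") -> str:
    """Send one Converse request and return the concatenated text blocks."""
    response = client.converse(
        modelId=model_id,
        messages=[{
            "role": "user",
            "content": [{"text": "Summarize this support ticket: " + ticket_text}],
        }],
        inferenceConfig={"maxTokens": 512, "temperature": 0.1},
    )
    blocks = response["output"]["message"]["content"]
    return "".join(block["text"] for block in blocks if "text" in block)


def lambda_handler(event, context, client=None):
    """Hypothetical handler: expects {"ticket_text": "..."} in the event."""
    ticket_text = (event or {}).get("ticket_text", "").strip()
    if not ticket_text:
        return {"statusCode": 400, "body": json.dumps({"error": "ticket_text is required"})}
    summary = summarize_ticket(client or _get_client(), ticket_text)
    return {"statusCode": 200, "body": json.dumps({"summary": summary})}
```

Passing the client in makes the handler testable with a stub, while production invocations fall back to the cached boto3 client.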

The InvokeModel API (Claude-specific)

invoke_model gives you direct access to Claude’s native request format. Use it when you need features that haven’t landed in Converse yet (extended thinking, certain tool-use patterns).

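# Continues from the Converse example: reuses client, json, and ticket_text.
response = client.invoke_model(
    modelId='us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    body=json.dumps({
        'anthropic_version': 'bedrock-2023-05-31',
        'messages': [{'role': 'user', 'content': ticket_text}],
        # max_tokens must exceed the thinking budget; it covers thinking plus the answer.
        'max_tokens': 4096,
        'thinking': {'type': 'enabled', 'budget_tokens': 2000}
    })
)
# The body is a stream; result['content'] holds thinking blocks followed by text blocks.
result = json.loads(response['body'].read())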
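
With extended thinking enabled, the parsed result contains more than one kind of content block, so indexing the first block no longer gives you the answer. A small helper, sketched here under the assumption that blocks carry a type of either thinking or text, separates the two. The name split_thinking_and_text is hypothetical.

```python
def split_thinking_and_text(result: dict) -> tuple:
    """Return (thinking, text) from a parsed InvokeModel response body.

    Blocks of any other type (for example tool use) are ignored here.
    """
    thinking_parts, text_parts = [], []
    for block in result.get("content", []):
        if block.get("type") == "thinking":
            thinking_parts.append(block.get("thinking", ""))
        elif block.get("type") == "text":
            text_parts.append(block.get("text", ""))
    return "".join(thinking_parts), "".join(text_parts)


# Usage with the result parsed in the example above:
# reasoning, answer = split_thinking_and_text(result)
```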

Lambda configuration for Bedrock calls

Key settings for a production Lambda that calls Bedrock:

| Setting | Recommended value | Reason |
| --- | --- | --- |
| Runtime | Python 3.12, arm64 | Cost, performance |
| Memory | 512 MB (tune up if slow) | Bedrock calls are I/O-bound, not CPU-bound |
| Timeout | 60–300 s | Long reasoning chains can take 30+ s |
| Execution role | bedrock:InvokeModel + bedrock:InvokeModelWithResponseStream | Minimum required |
| Retry policy | Exponential backoff, max 3 | Bedrock throttles at burst limits |
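
The retry policy in the table can be sketched as a small wrapper around any Bedrock call. This is an illustrative helper, not a boto3 API: the name call_with_backoff, the jitter amount, and the set of retryable error codes are assumptions. The error code is read from the exception's response attribute, which is how botocore's ClientError exposes it.

```python
import random
import time

# Assumption: treat these Bedrock runtime error codes as transient.
RETRYABLE_CODES = {"ThrottlingException", "ModelTimeoutException", "ServiceUnavailableException"}


def call_with_backoff(fn, max_retries: int = 3, base_delay: float = 1.0, sleep=time.sleep):
    """Call fn(), retrying transient errors with exponential backoff plus jitter."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except Exception as exc:
            response = getattr(exc, "response", None)
            code = response.get("Error", {}).get("Code") if isinstance(response, dict) else None
            if code not in RETRYABLE_CODES or attempt == max_retries:
                raise  # not transient, or retries exhausted
            delay = base_delay * (2 ** attempt)  # 1 s, 2 s, 4 s, ...
            sleep(delay + random.uniform(0, delay / 2))  # jitter spreads out bursts


# Usage, assuming a bedrock-runtime client and a prepared request dict:
# response = call_with_backoff(lambda: client.converse(**request))
```

botocore also has its own configurable retry behaviour, so in practice you may prefer to tune that first and keep a wrapper like this for application-level policy.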

Check your understanding

  1. Which IAM permission is commonly missed and causes Claude Code to silently freeze when connected to Bedrock?

  2. What is the primary advantage of the Bedrock Converse API over InvokeModel?

5. Orchestration with Step Functions

A single Lambda call handles stateless, single-turn Claude interactions well. When you need multi-step workflows — document ingestion, human review gates, parallel analysis branches, retry on failure — AWS Step Functions is the right tool.

Why Step Functions beats custom retry logic in Lambda

| Concern | DIY in Lambda | Step Functions |
| --- | --- | --- |
| Retry on throttle | Custom sleep loop | Built-in Retry with exponential backoff |
| Parallel tasks | Thread pools, complexity | Parallel state, managed |
| Human approval gate | Polling loop or SQS | Wait for task token built-in |
| Audit trail | CloudWatch logs | Full execution history in console |
| Timeout per step | One Lambda timeout | Per-state timeouts |

A minimal document-analysis state machine

{
  "Comment": "Analyse a document with Claude via Bedrock",
  "StartAt": "FetchDocument",
  "States": {
    "FetchDocument": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:fetch-doc",
      "Next": "AnalyseWithClaude"
    },
    "AnalyseWithClaude": {
      "Type": "Task",
      "Resource": "arn:aws:states:::bedrock:invokeModel",
      "Parameters": {
        "ModelId": "anthropic.claude-sonnet-4-5-20250929-v1:0",
        "Body": {
          "anthropic_version": "bedrock-2023-05-31",
          "messages": [{"role": "user", "content.$": "States.Format('Analyse this document: {}', $.documentText)"}],
          "max_tokens": 1024
        }
      },
      "Retry": [{"ErrorEquals": ["States.TaskFailed"], "IntervalSeconds": 5, "MaxAttempts": 3, "BackoffRate": 2}],
      "Next": "StoreResult"
    },
    "StoreResult": {
      "Type": "Task",
      "Resource": "arn:aws:lambda:us-east-1:123456789012:function:store-result",
      "End": true
    }
  }
}

Step Functions has a native arn:aws:states:::bedrock:invokeModel integration — you can call Claude directly from a state definition without writing a Lambda wrapper for that step.
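
Hand-written state machine definitions tend to break on transitions: a Next that points at a renamed state, or a state with neither Next nor End. A quick local check before deploying can catch these. The sketch below is a deliberately partial validator with a hypothetical name. It only inspects top-level states and skips Choice, Succeed, and Fail states, so it is no substitute for the service's own validation.

```python
def check_transitions(definition: dict) -> list:
    """Return a list of transition problems found in a state machine definition."""
    states = definition.get("States", {})
    problems = []
    start = definition.get("StartAt")
    if start not in states:
        problems.append(f"StartAt references unknown state {start!r}")
    for name, state in states.items():
        if state.get("Type") in ("Choice", "Succeed", "Fail"):
            continue  # these types do not use a top-level Next/End pair
        target = state.get("Next")
        if target is None and not state.get("End", False):
            problems.append(f"{name}: needs either Next or End")
        elif target is not None and target not in states:
            problems.append(f"{name}: Next references unknown state {target!r}")
    return problems


# Usage: load the JSON definition and print any problems.
# import json
# print(check_transitions(json.load(open("workflow.json"))))
```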

When to choose what

  • Single Lambda — one-turn response, user-facing latency matters, stateless
  • Lambda + Step Functions — multi-step pipeline, retries, parallel branches, human gates
  • Bedrock Agents — open-ended tool-use where the model itself decides which steps to run and in what order
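
The three options above reduce to two questions, which can be written down as a tiny decision function. This is only a restatement of the list in code, with hypothetical parameter names; real choices will also weigh latency, cost, and team familiarity.

```python
def choose_pattern(multi_step: bool, model_decides_steps: bool) -> str:
    """Map the two questions from the list above to a suggested pattern."""
    if model_decides_steps:
        # The model picks which tools to call and in what order.
        return "Bedrock Agents"
    if multi_step:
        # A fixed pipeline with retries, parallel branches, or human gates.
        return "Lambda + Step Functions"
    # One-turn, stateless, latency-sensitive.
    return "Single Lambda"


print(choose_pattern(multi_step=True, model_decides_steps=False))
# Lambda + Step Functions
```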

Check your understanding

  1. You need Claude to analyse 200 documents in parallel and fan the results into a single summary. Which AWS pattern fits best?

  2. Step Functions has a native integration resource for Bedrock (arn:aws:states:::bedrock:invokeModel). What does this mean in practice?

  3. Which component is most appropriate when you need Claude to autonomously decide which tools to call and in what order across an open-ended task?

6. Cost control and production readiness

Running Claude at scale on Bedrock requires attention to a few cost and reliability levers.

Prompt caching

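Bedrock supports prompt caching for Claude models, and Claude Code uses it automatically. In your own API calls you opt in by marking a cache checkpoint after the reusable prefix. Repeated prefixes (CLAUDE.md content, system prompts, tool definitions) can reduce input token costs by up to 90% when the same prefix is reused before the cache entry expires.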
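
When you call the Converse API yourself, a cache checkpoint can be marked explicitly after the part of the prompt you expect to reuse. The sketch below only builds the request; the function name build_cached_request is hypothetical, and whether a given prefix is actually cached depends on the model and on minimum prefix lengths, so treat this as a starting point and confirm with the usage figures in the response.

```python
def build_cached_request(system_prompt: str, user_text: str,
                         model_id: str = "us.anthropic.claude-sonnet-4-5-20250929-v1:0") -> dict:
    """Build Converse keyword arguments with a cache checkpoint after the system prompt."""
    return {
        "modelId": model_id,
        "system": [
            {"text": system_prompt},
            # Everything before this marker is eligible for caching.
            {"cachePoint": {"type": "default"}},
        ],
        "messages": [{"role": "user", "content": [{"text": user_text}]}],
        "inferenceConfig": {"maxTokens": 512},
    }


# Usage, assuming a bedrock-runtime client:
# response = client.converse(**build_cached_request(long_system_prompt, question))
# print(response["usage"])  # inspect the token counts to see whether the cache was hit
```

Keep the cached prefix byte-for-byte stable between calls; any change before the checkpoint produces a new cache entry.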

.claudeignore

Just like .gitignore, .claudeignore tells Claude Code which files to skip when indexing a project. Excluding node_modules/, dist/, *.log, and large data files keeps context windows focused and tokens cheap.

# .claudeignore
node_modules/
dist/
cdk.out/
*.log
*.parquet
__pycache__/

CloudWatch alarms for token spend

Bedrock publishes usage metrics to CloudWatch in the AWS/Bedrock namespace. Set alarms on InputTokenCount and OutputTokenCount (and on InvocationLatency for responsiveness), and pair them with a daily AWS Budgets alert so runaway agent loops don’t generate surprise bills.
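
A token alarm can be created with CloudWatch's put_metric_alarm call. The sketch below builds the parameters for an alarm on the OutputTokenCount metric that Bedrock publishes. The function name, alarm name, threshold, and SNS topic are placeholders, and the ModelId dimension value is an assumption: use whatever value appears for your model in the CloudWatch console.

```python
def token_alarm_params(alarm_name: str, model_id: str, threshold_tokens: int,
                       sns_topic_arn: str, period_seconds: int = 3600) -> dict:
    """Build put_metric_alarm arguments for output tokens summed per period."""
    return {
        "AlarmName": alarm_name,
        "Namespace": "AWS/Bedrock",
        "MetricName": "OutputTokenCount",
        "Dimensions": [{"Name": "ModelId", "Value": model_id}],
        "Statistic": "Sum",
        "Period": period_seconds,
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_tokens),
        "ComparisonOperator": "GreaterThanThreshold",
        "TreatMissingData": "notBreaching",  # no traffic should not trigger the alarm
        "AlarmActions": [sns_topic_arn],
    }


# Usage, assuming boto3 and an existing SNS topic:
# import boto3
# boto3.client("cloudwatch").put_metric_alarm(**token_alarm_params(
#     "bedrock-hourly-output-tokens",
#     "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
#     2_000_000,
#     "arn:aws:sns:us-east-1:123456789012:alerts"))
```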

Dual-model strategy

Assign model tiers to task types in your team CLAUDE.md:

## Model selection policy
- Routine edits, lint fixes, docs: use Haiku (cheapest)
- Interactive coding sessions: use Sonnet (default)
- Architecture review, complex refactors: use Opus

Teams that enforce this consciously typically cut token costs 40–60% compared to always using Sonnet.

Build it yourself

Follow these exact steps to reproduce it yourself · estimated time: ~25 min

Prerequisites

  • AWS CLI v2.32+ installed and configured (`aws --version`)
  • Claude Code installed (`claude --version`)
  • An IAM user or role with `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` permissions
  • Amazon Bedrock enabled in your chosen region (us-east-1 or us-west-2 recommended)
  • A code project to experiment in

Step 1 — Enable model access in Bedrock

Open the Amazon Bedrock console, go to Model access, and request access to the Claude models you plan to use (Sonnet is a good starting point). Access is usually granted within a few minutes.

Step 2 — Configure Claude Code for Bedrock

Create or update ~/.claude/settings.json:

{
  "awsAuthRefresh": "aws sso login --profile your-profile-name",
  "env": {
    "AWS_REGION": "us-east-1",
    "CLAUDE_CODE_USE_BEDROCK": "1"
  }
}

Replace the awsAuthRefresh value with the command you use to refresh credentials (aws login for the new auth flow), or omit the key if you use long-lived access keys.

Verify it works:

claude --version
# Start a session and run a quick test
claude
# Inside the session:
# What AWS region am I configured to use?

Step 3 — Bootstrap project memory

Navigate to your code project and initialise CLAUDE.md:

cd your-project
claude

Inside the session:

/init

Review the generated CLAUDE.md, then add AWS-specific context:

## AWS Environment
- Region: us-east-1
- Deploy: cdk deploy (CDK TypeScript)
- Lambda runtime: python3.12, arm64
- Secrets: Secrets Manager only

Commit the file so all teammates get the same context.

Step 4 — Add a .claudeignore

cat > .claudeignore << 'EOF'
node_modules/
dist/
cdk.out/
*.log
__pycache__/
.venv/
EOF

Step 5 — Call Claude from a Lambda (test locally first)

Create test_bedrock.py:

import boto3, json

client = boto3.client('bedrock-runtime', region_name='us-east-1')

response = client.converse(
    modelId='us.anthropic.claude-sonnet-4-5-20250929-v1:0',
    messages=[{
        'role': 'user',
        'content': [{'text': 'Say hello and confirm you are running on Amazon Bedrock.'}]
    }],
    inferenceConfig={'maxTokens': 128}
)

print(response['output']['message']['content'][0]['text'])

Run it:

python3 test_bedrock.py

Expected output: a short greeting confirming Bedrock access. If you see AccessDeniedException, check that both bedrock:InvokeModel and bedrock:InvokeModelWithResponseStream are in your IAM policy.

Step 6 — Create a minimal Step Functions workflow

Save workflow.json:

{
  "Comment": "Hello Bedrock from Step Functions",
  "StartAt": "AskClaude",
  "States": {
    "AskClaude": {
      "Type": "Task",
      "Resource": "arn:aws:states:::bedrock:invokeModel",
      "Parameters": {
        "ModelId": "us.anthropic.claude-sonnet-4-5-20250929-v1:0",
        "Body": {
          "anthropic_version": "bedrock-2023-05-31",
          "messages": [{"role": "user", "content": "What is the capital of France? Answer in one word."}],
          "max_tokens": 16
        }
      },
      "End": true
    }
  }
}

Create and run the state machine:

aws stepfunctions create-state-machine \
  --name hello-bedrock \
  --definition file://workflow.json \
  --role-arn arn:aws:iam::YOUR_ACCOUNT:role/StepFunctionsBedrockRole \
  --region us-east-1

aws stepfunctions start-execution \
  --state-machine-arn arn:aws:states:us-east-1:YOUR_ACCOUNT:stateMachine:hello-bedrock \
  --region us-east-1

Check the execution result in the Step Functions console. You should see Claude’s response in the output of the AskClaude state.
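
If you would rather check the result from a script than from the console, the execution can be polled with describe_execution. The sketch below takes the Step Functions client as a parameter so it can be tested with a stub. The helper name is hypothetical, and the assumption that the model response sits under a Body key in the execution output should be verified against your own execution.

```python
import json
import time


def wait_for_execution(sfn, execution_arn: str, poll_seconds: float = 2.0,
                       timeout_seconds: float = 120.0, sleep=time.sleep) -> dict:
    """Poll an execution until it finishes and return its parsed JSON output."""
    waited = 0.0
    while True:
        description = sfn.describe_execution(executionArn=execution_arn)
        status = description["status"]
        if status == "SUCCEEDED":
            return json.loads(description["output"])
        if status != "RUNNING":
            raise RuntimeError(f"execution ended with status {status}")
        if waited >= timeout_seconds:
            raise TimeoutError(f"execution still running after {waited:.0f} s")
        sleep(poll_seconds)
        waited += poll_seconds


# Usage, assuming boto3 and the ARN returned by start-execution:
# import boto3
# output = wait_for_execution(boto3.client("stepfunctions"), execution_arn)
# print(output["Body"]["content"][0]["text"])  # assumed output shape
```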

Expected result: a working end-to-end path from developer laptop → Bedrock-powered Claude Code → production Lambda/Step Functions workflow, all governed by IAM and billed to a single AWS account. If any step fails, check IAM permissions first — they are the most common blocker at each layer.
