Use Portkey with AWS’s Strands Agents to take your AI Agents to production
Strands Agents is a simple-to-use agent framework built by AWS. Portkey enhances Strands Agents with production-grade observability, reliability, and multi-provider support—all through a single integration that requires no changes to your existing agent logic.
What you get with this integration:
- Complete observability of every agent step, tool use, and LLM interaction
- Built-in reliability with automatic fallbacks, retries, and load balancing
- 200+ LLMs accessible through the same OpenAI-compatible interface
- Production monitoring with traces, logs, and real-time metrics
- Zero code changes to your existing Strands agent implementations
The integration leverages Strands’ flexible client_args parameter, which passes any arguments directly to the OpenAI client constructor. By setting base_url to Portkey’s gateway, all requests route through Portkey while maintaining full compatibility with the OpenAI API.
```python
# This is what happens under the hood in Strands:
client_args = client_args or {}
self.client = openai.OpenAI(**client_args)  # Your Portkey config gets passed here
```
This means you get all of Portkey’s features without any changes to your agent logic, tool usage, or response handling.
Before using the integration, you need to configure your AI providers and create a Portkey API key.
1. Add Your AI Provider Keys
Go to Virtual Keys in the Portkey dashboard and add your actual AI provider keys (OpenAI, Anthropic, etc.). Each provider key gets a virtual key ID that you’ll reference in configs.
2. Create a Configuration
Go to Configs to define how requests should be routed. A basic config looks like:
{ "virtual_key": "openai-key-abc123"}
For production setups, you can add fallbacks, load balancing, and conditional routing here.
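For instance, a config that falls back from OpenAI to Anthropic could look like the sketch below (the second virtual key ID is a placeholder for one you created in step 1):

```json
{
  "strategy": { "mode": "fallback" },
  "targets": [
    { "virtual_key": "openai-key-abc123" },
    { "virtual_key": "anthropic-key-xyz789" }
  ]
}
```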
3. Generate Your Portkey API Key
Go to API Keys to create a new API key. Attach your config as the default routing config, and you’ll get an API key that routes to your configured providers.
Here’s a full example showing how to set up a Strands agent with Portkey integration:
```python
from strands import Agent
from strands.models.openai import OpenAIModel
from strands_tools import calculator, web_search
from portkey_ai import PORTKEY_GATEWAY_URL

# Initialize model through Portkey
model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL
    },
    model_id="gpt-4o",
    params={
        "max_tokens": 1000,
        "temperature": 0.7,
    }
)

# Create agent with tools (unchanged from standard Strands usage)
agent = Agent(
    model=model,
    tools=[calculator, web_search]
)

# Use the agent (unchanged from standard Strands usage)
response = agent("Calculate the compound interest on $10,000 at 5% for 10 years, then search for current inflation rates")
print(response)
```
The agent will automatically use both tools as needed, and every step will be logged in your Portkey dashboard with full request/response details, timing, and token usage.
Portkey provides comprehensive visibility into your agent’s behavior without requiring any code changes.
Track the complete execution flow of your agents with hierarchical traces that show:
- LLM calls: Every request to language models with full payloads
- Tool invocations: Which tools were called, with what parameters, and their responses
- Decision points: How the agent chose between different tools or approaches
- Performance metrics: Latency, token usage, and cost for each step
```python
from strands import Agent
from strands.models.openai import OpenAIModel
from strands_tools import calculator
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL,
        # Add trace ID to group related requests
        "default_headers": createHeaders(trace_id="user_session_123")
    },
    model_id="gpt-4o",
    params={"temperature": 0.7}
)

agent = Agent(model=model, tools=[calculator])
response = agent("What's 15% of 2,847?")
```
All requests from this agent will be grouped under the same trace, making it easy to analyze the complete interaction flow.
Add business context to your agent runs for better filtering and analysis:
```python
from strands.models.openai import OpenAIModel
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL,
        "default_headers": createHeaders(
            trace_id="customer_support_bot",
            metadata={
                "agent_type": "customer_support",
                "user_tier": "premium",
                "session_id": "sess_789",
                "department": "billing"
            }
        )
    },
    model_id="gpt-4o",
    params={"temperature": 0.3}  # Lower temperature for support tasks
)
```
This metadata appears in your Portkey dashboard, allowing you to filter logs and analyze performance by user type, session, or any custom dimension.
Monitor your agents in production with built-in dashboards that track:
- Success rates: Percentage of successful agent completions
- Average latency: Response times across different agent types
- Token usage: Track consumption and costs across models
- Error patterns: Common failure modes and their frequency
All metrics can be segmented by the metadata you provide, giving you insights like “premium user agents have 15% higher success rates” or “billing department queries take 2x longer on average.”
When running agents in production, things can go wrong: API rate limits, network issues, or provider outages. Portkey's reliability features keep your agents running smoothly even when problems occur.
It's simple to enable fallbacks in your Strands Agents with a Portkey config, which you can attach at runtime or directly to your Portkey API key. Here's an example of attaching a config at runtime:
Configure multiple providers so your agents keep working even when one provider fails:
```python
from strands.models.openai import OpenAIModel
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL,
        "default_headers": createHeaders(
            config={
                "strategy": {
                    "mode": "fallback",
                    "on_status_codes": [429, 503, 502]  # Rate limits and server errors
                },
                "targets": [
                    {"virtual_key": "openai-key-primary"},   # Try OpenAI first
                    {"virtual_key": "anthropic-key-backup"}  # Fall back to Claude
                ]
            }
        )
    },
    model_id="gpt-4o",  # Will map to equivalent models on each provider
    params={"temperature": 0.7}
)
```
If OpenAI returns a rate limit error (429), Portkey automatically retries the request with Anthropic’s Claude, using default model mappings.
Distribute requests across multiple API keys to stay within rate limits:
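Here's a minimal sketch of a load-balancing config attached at runtime; the virtual key names and the 70/30 weights are placeholders you'd replace with your own keys:

```python
from strands.models.openai import OpenAIModel
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL,
        "default_headers": createHeaders(
            config={
                "strategy": {"mode": "loadbalance"},
                "targets": [
                    {"virtual_key": "openai-key-1", "weight": 0.7},  # ~70% of traffic
                    {"virtual_key": "openai-key-2", "weight": 0.3}   # ~30% of traffic
                ]
            }
        )
    },
    model_id="gpt-4o",
    params={"temperature": 0.7}
)
```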
Requests will be distributed 70/30 across your two OpenAI keys, helping you maximize throughput without hitting individual key limits.
Route requests to different providers/models based on custom logic (like metadata, input content, or user attributes) using Portkey’s Conditional Routing feature.
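As a sketch of what this can look like (the metadata field, target names, and virtual keys below are illustrative placeholders; see Portkey's Conditional Routing docs for the full query syntax):

```python
from portkey_ai import createHeaders

# Route premium users to a stronger model, everyone else to a cheaper default
headers = createHeaders(
    metadata={"user_tier": "premium"},  # Used by the routing conditions below
    config={
        "strategy": {
            "mode": "conditional",
            "conditions": [
                {
                    "query": {"metadata.user_tier": {"$eq": "premium"}},
                    "then": "premium-target"
                }
            ],
            "default": "standard-target"
        },
        "targets": [
            {"name": "premium-target", "virtual_key": "openai-key-primary"},
            {"name": "standard-target", "virtual_key": "openai-key-mini"}
        ]
    }
)
```

These headers would then be passed as `default_headers` in `client_args`, exactly as in the examples above.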
Access 1,600+ models through the same Strands interface by changing just the provider configuration:
```python
from strands import Agent
from strands.models.openai import OpenAIModel
from strands_tools import calculator
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

# Use Claude instead of GPT-4
model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL,
        "default_headers": createHeaders(
            provider="anthropic",
            api_key="YOUR_ANTHROPIC_KEY"  # Can also use virtual keys
        )
    },
    model_id="claude-3-7-sonnet-latest",  # Claude model ID
    params={"max_tokens": 1000, "temperature": 0.7}
)

# Agent code remains exactly the same
agent = Agent(model=model, tools=[calculator])
response = agent("Explain quantum computing in simple terms")
```
```python
from strands import Agent
from strands.models.openai import OpenAIModel
from strands_tools import calculator
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

# Create different model instances for different tasks
reasoning_model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL,
        "default_headers": createHeaders(virtual_key="openai-key")
    },
    model_id="gpt-4o",
    params={"temperature": 0.1}  # Low temperature for reasoning
)

creative_model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL,
        "default_headers": createHeaders(virtual_key="gemini-key")
    },
    model_id="gemini-2.0-flash-exp",
    params={"temperature": 0.8}  # Higher temperature for creativity
)

# Use different models for different agent types
reasoning_agent = Agent(model=reasoning_model, tools=[calculator])
creative_agent = Agent(model=creative_model, tools=[])
```
Portkey provides access to LLMs from providers including:
- OpenAI (GPT-4o, GPT-4 Turbo, etc.)
- Anthropic (Claude 3.5 Sonnet, Claude 3 Opus, etc.)
Solution: Verify your Portkey API key and provider setup. Test your Portkey API key directly and check for common issues such as a wrong API key format, misconfigured provider virtual keys, and missing config attachments.
```python
# Test your Portkey API key directly
from portkey_ai import Portkey

portkey = Portkey(api_key="YOUR_PORTKEY_API_KEY")
response = portkey.chat.completions.create(
    messages=[{"role": "user", "content": "test"}],
    model="gpt-4o"
)
print(response)
```
Problem: Hitting rate limits despite having fallbacks configured
Problem: Not seeing traces or logs in Portkey dashboard
Solution: Verify your requests are going through Portkey:
```python
from portkey_ai import createHeaders

# Check that base_url is set correctly
print(model.client.base_url)  # Should be https://api.portkey.ai/v1

# Add trace IDs for easier debugging
headers = createHeaders(
    trace_id="debug-session-123",
    metadata={"debug": "true"}
)
```
Also check the Logs section in your Portkey dashboard and filter by your metadata.
Portkey adds production-readiness to Strands Agents through comprehensive observability (traces, logs, metrics), reliability features (fallbacks, retries, caching), and access to 200+ LLMs through a unified interface. This makes it easier to debug, optimize, and scale your agent applications, all while preserving Strands Agents’ strong type safety.
Yes! Portkey integrates seamlessly with existing Strands Agents applications. You just need to replace your client initialization code with the Portkey-enabled version. The rest of your agent code remains unchanged and continues to benefit from Strands Agents’ strong typing.
Portkey supports all Strands Agents features, including tool use, multi-agent systems, and more. It adds observability and reliability without limiting any of the framework’s functionality.
Yes, Portkey allows you to use a consistent trace_id across multiple agents and requests to track the entire workflow. This is especially useful for multi-agent systems where you want to understand the full execution path.
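For example, two agents in the same workflow can share one trace by passing the same trace_id when each model is created (a minimal sketch; the trace ID and model choices are placeholders):

```python
from strands import Agent
from strands.models.openai import OpenAIModel
from portkey_ai import PORTKEY_GATEWAY_URL, createHeaders

shared_headers = createHeaders(trace_id="workflow_42")  # Same trace for both agents

researcher_model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL,
        "default_headers": shared_headers
    },
    model_id="gpt-4o",
    params={"temperature": 0.2}
)

writer_model = OpenAIModel(
    client_args={
        "api_key": "YOUR_PORTKEY_API_KEY",
        "base_url": PORTKEY_GATEWAY_URL,
        "default_headers": shared_headers
    },
    model_id="gpt-4o",
    params={"temperature": 0.7}
)

# Both agents' requests show up under the "workflow_42" trace in the Portkey dashboard
researcher = Agent(model=researcher_model, tools=[])
writer = Agent(model=writer_model, tools=[])
```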
Portkey allows you to add custom metadata to your agent runs, which you can then use for filtering. Add fields like agent_name, agent_type, or session_id to easily find and analyze specific agent executions.
Yes! Portkey uses your own API keys for the various LLM providers. It securely stores them as virtual keys, allowing you to easily manage and rotate keys without changing your code.