Hermes Agent is an open-source autonomous agent from Nous Research that runs on your server — not tethered to an IDE. It lives on CLI, Telegram, Discord, Slack, WhatsApp, and 10+ other platforms, with persistent memory, self-improving skills, and full terminal/browser control. Hermes works with any OpenAI-compatible endpoint, which makes Portkey a drop-in gateway. Route Hermes through Portkey to get full observability, cost tracking, budget controls, fallbacks, and access to 200+ models — without changing how you use Hermes.Documentation Index
Fetch the complete documentation index at: https://docs.portkey.ai/docs/llms.txt
Use this file to discover all available pages before exploring further.
Quick start
Install Hermes
Add a provider in Portkey
openai-prod or anthropic-prod.
Get your Portkey API key
Configure Hermes to use Portkey
~/.hermes/config.yaml to point the main model at Portkey’s OpenAI-compatible endpoint:YOUR_PORTKEY_API_KEYwith your Portkey API key@openai-prodwith your provider sluggpt-4owith any model that provider supports
@<provider-slug>/<model-name> — this maps directly to a provider integration in your Portkey workspace.Start chatting
Reliability
All reliability controls are configured through Portkey Configs and attached to a scoped API key. Hermes sends the key as a Bearer token — Portkey applies the logic server-side, so Hermes config stays simple. Create a Config at Configs, then attach the Config ID to a scoped API key at API Keys → Create Key → Advanced Settings. Use that scoped key asOPENAI_API_KEY in Hermes.
Fallbacks
Route to backup providers when the primary fails — critical for long-running autonomous Hermes sessions where a provider outage shouldn’t break the task:Load balancing
Distribute requests across multiple keys or providers:Caching
Reduce costs and latency for repeated queries (common with Hermes’s scheduled cron jobs):Retries
Automatically retry failed requests:Metadata for tracing
Attach metadata in the same Config to group and filter logs by session, host, environment, or user:Budget limits
Hermes runs unattended via cron, messaging gateways, and subagents — costs can spiral without controls. Set provider-level limits in Portkey:- Go to AI Providers → select your provider
- Click Budget & Limits
- Configure:
- Cost limit: e.g., $200/month
- Token limit: e.g., 10M tokens/week
- Rate limit: requests per minute

Switch models mid-session
Inside a Hermes chat, use/model to switch to any model available in your Portkey workspace:
Multiple Portkey providers
To keep multiple Portkey-backed routes distinct — say, a production key for your main agent and a separate key for subagents — define them as named custom providers in~/.hermes/config.yaml:
~/.hermes/.env:
3. Set Up Enterprise Governance
Why Enterprise Governance?- Cost Management: Controlling and tracking AI spending across teams
- Access Control: Managing team access and workspaces
- Usage Analytics: Understanding how AI is being used across the organization
- Security & Compliance: Maintaining enterprise security standards
- Reliability: Ensuring consistent service across all users
- Model Management: Managing what models are being used in your setup
Step 1: Implement Budget Controls & Rate Limits
Step 1: Implement Budget Controls & Rate Limits
Step 1: Implement Budget Controls & Rate Limits
Model Catalog enables you to have granular control over LLM access at the team/department level. This helps you:- Set up budget limits
- Prevent unexpected usage spikes using Rate limits
- Track departmental spending
Setting Up Department-Specific Controls:
- Navigate to Model Catalog in Portkey dashboard
- Create new Provider for each engineering team with budget limits and rate limits
- Configure department-specific limits

Step 2: Define Model Access Rules
Step 2: Define Model Access Rules
Step 2: Define Model Access Rules
As your AI usage scales, controlling which teams can access specific models becomes crucial. You can simply manage AI models in your org by provisioning model at the top integration level.
Step 4: Set Routing Configuration
Step 4: Set Routing Configuration
- Data Protection: Implement guardrails for sensitive code and data
- Reliability Controls: Add fallbacks, load-balance, retry and smart conditional routing logic
- Caching: Implement Simple and Semantic Caching. and more…
Example Configuration:
Here’s a basic configuration to load-balance requests to OpenAI and Anthropic:Step 4: Implement Access Controls
Step 4: Implement Access Controls
Step 3: Implement Access Controls
Create User-specific API keys that automatically:- Track usage per developer/team with the help of metadata
- Apply appropriate configs to route requests
- Collect relevant metadata to filter logs
- Enforce access permissions
Step 5: Deploy & Monitor
Step 5: Deploy & Monitor
Step 4: Deploy & Monitor
After distributing API keys to your engineering teams, your enterprise-ready setup is ready to go. Each developer can now use their designated API keys with appropriate access levels and budget controls. Apply your governance setup using the integration steps from earlier sections Monitor usage in Portkey dashboard:- Cost tracking by engineering team
- Model usage patterns for AI agent tasks
- Request volumes
- Error rates and debugging logs
Enterprise Features Now Available
You now have:- Departmental budget controls
- Model access governance
- Usage tracking & attribution
- Security guardrails
- Reliability features
Portkey Features
Now that you have an enterprise-grade setup, let’s explore the comprehensive features Portkey provides to ensure secure, efficient, and cost-effective AI operations.1. Comprehensive Metrics
Using Portkey you can track 40+ key metrics including cost, token usage, response time, and performance across all your LLM providers in real time. You can also filter these metrics based on custom metadata that you can set in your configs. Learn more about metadata here.
2. Advanced Logs
Portkey’s logging dashboard provides detailed logs for every request made to your LLMs. These logs include:- Complete request and response tracking
- Metadata tags for filtering
- Cost attribution and much more…

3. Unified Access to 1600+ LLMs
You can easily switch between 1600+ LLMs. Call various LLMs such as Anthropic, Gemini, Mistral, Azure OpenAI, Google Vertex AI, AWS Bedrock, and many more by simply changing theprovider slug in your default config object.
4. Advanced Metadata Tracking
Using Portkey, you can add custom metadata to your LLM requests for detailed tracking and analytics. Use metadata tags to filter logs, track usage, and attribute costs across departments and teams.Custom Metata
5. Enterprise Access Management
Budget Controls
Single Sign-On (SSO)
Organization Management
Access Rules & Audit Logs
6. Reliability Features
Fallbacks
Conditional Routing
Load Balancing
Caching
Smart Retries
Budget Limits
7. Advanced Guardrails
Protect your Project’s data and enhance reliability with real-time checks on LLM inputs and outputs. Leverage guardrails to:- Prevent sensitive data leaks
- Enforce compliance with organizational policies
- PII detection and masking
- Content filtering
- Custom security rules
- Data compliance checks
Guardrails
FAQs
How do I update my AI Provider limits after creation?
How do I update my AI Provider limits after creation?
Can I use multiple LLM providers with the same API key?
Can I use multiple LLM providers with the same API key?
How do I track costs for different teams?
How do I track costs for different teams?
- Create separate AI Providers for each team
- Use metadata tags in your configs
- Set up team-specific API keys
- Monitor usage in the analytics dashboard
What happens if a team exceeds their budget limit?
What happens if a team exceeds their budget limit?
- Further requests will be blocked
- Team admins receive notifications
- Usage statistics remain available in dashboard
- Limits can be adjusted if needed

