Llama Prompt Ops is a Python package that automatically optimizes prompts for Llama models. It transforms prompts that work well with other LLMs into prompts that are optimized for Llama models, improving performance and reliability.
This guide shows you how to combine Llama Prompt Ops with Portkey to optimize prompts for Llama models using enterprise-grade LLM infrastructure. You’ll build a system that analyzes support messages to extract urgency, sentiment, and relevant service categories.
Learn more about Llama Prompt Ops on its official GitHub repository.
You can explore the complete dataset and prompt in the use-cases/facility-support-analyzer directory, which contains the sample data and system prompts used in this guide. By using Portkey, you'll gain access to a unified API for 1600+ LLMs, full observability over every request, enterprise governance controls (budgets, access rules, audit logs), and reliability features such as fallbacks, retries, and caching.
Portkey provides a drop-in replacement for LLM providers in the llama-prompt-ops workflow, requiring minimal configuration changes.
Before diving into the installation, let’s take a closer look at the components of this use case. You can find all the relevant files in the use-cases/facility-support-analyzer directory:
- facility_prompt_sys.txt - System prompt for the task
- facility_v2_test.json - Dataset with customer service messages
- facility-simple.yaml - Simple configuration file
- eval.ipynb - Evaluation notebook

The system prompt instructs the LLM to analyze customer service messages and extract structured information in JSON format:
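The exact prompt text lives in facility_prompt_sys.txt; as a rough illustration only (not the actual file contents), it asks for output along these lines:

```text
You are a facility support analyst. Read the customer message and respond
with a JSON object containing:
  - "urgency": one of "low", "medium", "high"
  - "sentiment": one of "negative", "neutral", "positive"
  - "categories": the facility service categories relevant to the request
Respond with valid JSON only.
```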
The dataset consists of customer service messages in JSON format. Each entry contains:
Example entry:
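The real field names come from facility_v2_test.json; the shape below is a hypothetical illustration of a message paired with its ground-truth annotation, not a verbatim entry from the dataset:

```json
{
  "input": "Hi, the elevator in Building C has been out since Monday and tenants are getting frustrated. Can someone take a look urgently?",
  "answer": {
    "urgency": "high",
    "sentiment": "negative",
    "categories": ["Elevator Maintenance"]
  }
}
```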
The FacilityMetric evaluates the model’s outputs by comparing them to the ground truth answers. It checks:
The metric parses both the predicted and ground truth JSON outputs, compares them field by field, and calculates an overall score that reflects how well the model is performing on this task.
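The actual FacilityMetric ships with llama-prompt-ops; the sketch below is a simplified stand-in that shows the general idea of parsing both JSON outputs and scoring field-by-field agreement (the field names and equal weighting are assumptions, not the library's implementation):

```python
import json

def facility_score(prediction: str, ground_truth: str) -> float:
    """Toy field-by-field comparison between predicted and gold JSON outputs."""
    try:
        pred = json.loads(prediction)
        gold = json.loads(ground_truth)
    except json.JSONDecodeError:
        return 0.0  # unparseable model output scores zero

    fields = ["urgency", "sentiment", "categories"]
    correct = 0
    for field in fields:
        if field == "categories":
            # compare category sets so list order doesn't matter
            correct += set(pred.get(field, [])) == set(gold.get(field, []))
        else:
            correct += pred.get(field) == gold.get(field)
    return correct / len(fields)
```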
Before you begin, make sure you have: a Portkey account with access to the Portkey dashboard, an API key from your underlying LLM provider (used to create the virtual key), and the llama-prompt-ops package installed.
Portkey allows you to use 1600+ LLMs with your llama-prompt-ops setup, with minimal configuration required. Let’s set up the core components in Portkey that you’ll need for integration.
Create Virtual Key
Virtual Keys are Portkey's secure way to manage your LLM provider API keys. Think of them like disposable credit cards for your LLM API keys, providing essential controls like budget limits, rate limits, and secure storage of provider credentials.
To create a virtual key, go to Virtual Keys in the Portkey App, add your provider's API key, then save and copy the virtual key ID.
Save your virtual key ID - you’ll need it for the next step.
Create Default Config
Configs in Portkey are JSON objects that define how your requests are routed. They help with implementing features like advanced routing, fallbacks, and retries.
We need to create a default config to route our requests to the virtual key created in Step 1.
To create your config, go to Configs in the Portkey dashboard, create a new config along the lines of the sketch below, and save it. This basic config simply connects requests to your virtual key; you can add more advanced Portkey features later.
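A minimal default config can be as small as a single virtual_key reference (the ID below is a placeholder for the one you created in Step 1):

```json
{
  "virtual_key": "YOUR_VIRTUAL_KEY_ID"
}
```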
Configure Portkey API Key
Now create a Portkey API key and attach the config you created in Step 2.
Save your API key securely - you’ll need it for llama-prompt-ops integration.
With Portkey set up, we now need to configure llama-prompt-ops to use Portkey instead of OpenRouter.
If you haven’t already, create a sample project:
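For example (check `llama-prompt-ops --help` for the exact command set in your installed version):

```bash
llama-prompt-ops create my-project
cd my-project
```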
In the .env file, replace the OpenRouter API key with your Portkey API key:
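Assuming the generated sample project reads the key from a variable such as OPENROUTER_API_KEY (check your .env for the exact variable name), this just means swapping in the Portkey key as the value:

```bash
# .env — keep whatever variable name your generated project already uses;
# only the value changes to your Portkey API key
OPENROUTER_API_KEY=YOUR_PORTKEY_API_KEY
```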
Modify your config.yaml file to use Portkey instead of OpenRouter:
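The exact layout of your generated config may differ; the relevant model settings look roughly like the sketch below (the `model:` block and its key names are assumptions based on the notes that follow, and the model name is a pass-through identifier since your Portkey config decides which model actually serves the request):

```yaml
model:
  name: "openai/openai/gpt-4o"          # placeholder; real routing is decided by your Portkey config
  api_base: "https://api.portkey.ai/v1" # point llama-prompt-ops at Portkey's gateway
```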
Important Notes:

- The name parameter can be set to openai/openai/gpt-4o or any other model identifier, but the actual model selection is handled by your Portkey config.
- Set api_base to "https://api.portkey.ai/v1".

Now you're ready to run llama-prompt-ops with Portkey:
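For example, using the package's migrate command (consult `llama-prompt-ops --help` if your version expects different flags or an explicit path to your config file):

```bash
llama-prompt-ops migrate
```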
The tool will run the optimization process using your Portkey configuration: it loads your dataset and system prompt, sends every optimization and evaluation request through Portkey's gateway to the model selected by your config, and saves the optimized prompt when it finishes.
The optimized prompt will be saved in the results/ directory with a filename like facility-simple_YYYYMMDD_HHMMSS.yaml. When you open this file, you'll see something like this:
Now that you have an enterprise-grade llama-prompt-ops setup, let's explore the comprehensive features Portkey provides to ensure secure, efficient, and cost-effective AI operations.
Using Portkey you can track 40+ key metrics including cost, token usage, response time, and performance across all your LLM providers in real time. You can also filter these metrics based on custom metadata that you can set in your configs. Learn more about metadata here.
Portkey's logging dashboard provides detailed logs for every request made to your LLMs, including the full request and response payloads, cost, token usage, and latency.
You can easily switch between 1600+ LLMs. Call various LLMs such as Anthropic, Gemini, Mistral, Azure OpenAI, Google Vertex AI, AWS Bedrock, and many more by simply changing the virtual_key in your default config object.
Using Portkey, you can add custom metadata to your LLM requests for detailed tracking and analytics. Use metadata tags to filter logs, track usage, and attribute costs across departments and teams.
For enterprise deployments, Portkey also provides governance and security controls:

- Budget Controls: Set and manage spending limits across teams and departments, with granular budget limits and usage tracking.
- Single Sign-On: Enterprise-grade SSO integration with support for SAML 2.0, Okta, Azure AD, and custom providers for secure authentication.
- Organization Management: Hierarchical organization structure with workspaces, teams, and role-based access control for enterprise-scale deployments.
- Access Rules & Audit Logs: Comprehensive access control rules and detailed audit logging for security compliance and usage tracking.
Portkey's configs also give you built-in reliability features:

- Fallbacks: Automatically switch to backup targets if the primary target fails (see the config sketch below).
- Conditional Routing: Route requests to different targets based on specified conditions.
- Load Balancing: Distribute requests across multiple targets based on defined weights.
- Caching: Enable caching of responses to improve performance and reduce costs.
- Automatic Retries: Retry failed requests automatically with exponential backoff.
- Budget Limits: Set and manage budget limits across teams and departments, with granular budget limits and usage tracking.
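As a concrete illustration of how these features are expressed, a config with two virtual keys in fallback mode might look like the sketch below (the virtual key IDs are placeholders, and the exact schema should be checked against Portkey's config documentation):

```json
{
  "strategy": { "mode": "fallback" },
  "targets": [
    { "virtual_key": "primary-provider-key" },
    { "virtual_key": "backup-provider-key" }
  ]
}
```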
Protect your project's data and enhance reliability with real-time checks on LLM inputs and outputs. Guardrails let you implement real-time protection for your LLM interactions, with automatic detection and filtering of sensitive content, PII, and custom security rules, enabling comprehensive data protection while maintaining compliance with organizational policies.
How do I update my Virtual Key limits after creation?
You can update your Virtual Key limits at any time from the Portkey dashboard.
Can I use multiple LLM providers with the same API key?
Yes! You can create multiple Virtual Keys (one for each provider) and attach them to a single config. This config can then be connected to your API key, allowing you to use multiple providers through a single API key.
How do I track costs for different teams?
Portkey provides several ways to track team costs: attach custom metadata to requests and filter analytics and logs by those tags, set team-level budget limits, and organize teams into separate workspaces for roll-up reporting.
What happens if a team exceeds their budget limit?
When a team reaches their budget limit, further requests against that key are blocked until the limit is increased or reset, so costs can't silently overrun.
How do I control which models are available in llama-prompt-ops workflows?
You can control model access by creating virtual keys only for the providers and models you want to expose, attaching them to configs, and connecting those configs to the API keys you hand out for llama-prompt-ops, so requests can only reach the models your configs allow.
By integrating Portkey with llama-prompt-ops, you’ve unlocked access to a vast array of LLMs while gaining enterprise-grade controls, observability, and governance features. This setup provides both flexibility in model selection and robust infrastructure for production deployments.
Portkey’s unified API approach also means you can easily switch between different models without changing your code, making it simple to test and compare different LLMs for optimal performance with your prompt optimization workflows.