Integrate DSPy with Portkey for production-ready LLM pipelines
DSPy is a framework for algorithmically optimizing language model prompts and weights.
Portkey’s integration with DSPy makes your DSPy pipelines production-ready with detailed insights on costs & performance metrics for each run, and also makes your existing DSPy code work across 250+ LLMs.
Portkey extends the existing OpenAI client in DSPy and makes it work with 250+ LLMs and gives you detailed cost insights. Just change api_base and add Portkey related headers in the default_headers param.
import osfrom portkey_ai import PORTKEY_GATEWAY_URL, createHeadersimport dspy# Set up your Portkey clientturbo = dspy.OpenAI( api_base=PORTKEY_GATEWAY_URL + "/", model='gpt-4o', max_tokens=250, api_key="YOUR_OPENAI_API_KEY", # Enter Your OpenAI key model_type="chat", default_headers=createHeaders( api_key="YOUR_PORTKEY_API_KEY", # Enter Your Portkey API Key metadata={'_user': "dspy"}, provider="openai" ))# Configure DSPy to use the Portkey-enabled clientdspy.settings.configure(lm=turbo)
🎉 Voila! that’s all you need to do integrate Portkey with DSPy. Let’s try making our first request.
Here’s a simple Google Colab notebook that demonstrates DSPy with Portkey integration
import dspy# Set up the Portkey-enabled client (as shown in the Getting Started section)class QA(dspy.Signature): """Given the question, generate the answer""" question = dspy.InputField(desc="User's question") answer = dspy.OutputField(desc="often between 1 and 3 words")dspy.settings.configure(lm=turbo)predict = dspy.Predict(QA)# Make a predictionprediction = predict(question="Who won the Golden Boot in the 2022 FIFA World Cup?")print(prediction.answer)
When you make a request using Portkey with DSPy, you can view detailed information about the request in the Portkey dashboard. Here’s what you’ll see:
Request Details: Information about the specific request, including the model used, input, and output.
Metrics: Performance metrics such as latency, token usage, and cost.
Logs: Detailed logs of the request, including any errors or warnings.
Traces: A visual representation of the request flow, especially useful for complex DSPy modules.
Portkey’s Unified API enables you to easily switch between 250+ language models. This includes the LLMs that are not natively integrated with DSPy. Here’s how you can modify your DSPy setup to use Claude from Gpt-4 model:
# OpenAI setupturbo = dspy.OpenAI( api_base=PORTKEY_GATEWAY_URL + "/", model='gpt-4o', api_key="YOUR_OPENAI_API_KEY", # Enter your OpenAI API key model_type="chat", default_headers=createHeaders( api_key="YOUR_PORTKEY_API_KEY", metadata={'_user': "dspy"}, provider="openai" ))dspy.settings.configure(lm=turbo)# Anthropic setupturbo = dspy.OpenAI( api_base=PORTKEY_GATEWAY_URL + "/", model='claude-3-opus-20240229', # Change the model name from Gpt-4 to claude api_key="YOUR_Anthropic_API_KEY", # Enter your Anthropic API key model_type="chat", default_headers=createHeaders( api_key="YOUR_PORTKEY_API_KEY", metadata={'_user': "dspy"}, # Enter any key-value pair for filtering logs trace_id="test_dspy_trace", provider="anthropic" # Change your provider, you can find the provider slug in Portkey's docs ))dspy.settings.configure(lm=turbo)
Portkey provides detailed tracing for each request. This is especially useful for complex DSPy modules with multiple LLM calls. You can view these traces in the Portkey dashboard to understand the flow of your DSPy application.
Portkey’s Observability suite helps you track key metrics like cost and token usage, which is crucial for managing the high cost of DSPy. The observability dashboard helps you track 40+ key metrics, giving you detailed insights into your DSPy run.
Caching can significantly reduce these costs by storing frequently used data and responses. While DSPy has built-in simple caching, Portkey also offers advanced semantic caching to help you save more time and money.
Just modify your Portkey config as shown below and pass it with the config key in the default_headers param:
config={ "cache": { "mode": "semantic" } }turbo = dspy.OpenAI( api_base=PORTKEY_GATEWAY_URL + "/", model='gpt-4o', api_key="YOUR_OPENAI_API_KEY", # Enter your OpenAI API key model_type="chat", default_headers=createHeaders( api_key="YOUR_PORTKEY_API_KEY", metadata={'_user': "dspy"}, provider="openai", config=config ))dspy.settings.configure(lm=turbo)
Portkey offers built-in fallbacks between different LLMs or providers, load-balancing across multiple instances or API keys, and implementing automatic retries and request timeouts. This makes your DSPy more reliable and resilient.
Similar to caching example above, just define your Config and pass it with the Config key in the default_headers param.
DSPy uses caching for LLM calls by default, which means repeated identical requests won’t generate new API calls or new traces in Langtrace. To ensure you capture every LLM call, follow these steps:
Disable Caching: For full tracing during debugging, turn off DSPy’s caching. Check the DSPy documentation for detailed instructions on how to disable caching.
Use Unique Inputs: To test effectively, make sure each run uses different inputs to avoid triggering the cache.
Clear the Cache: If you need to test the same inputs again, clear DSPy’s cache between runs to ensure fresh API requests.
Verify Configuration: Confirm that your DSPy setup is correctly configured to use the intended LLM provider.
If you still face issues after following these steps, please reach out to our support team for additional help.
Remember to manage caching wisely in production to strike the right balance between thorough tracing and performance efficiency.