Integrate vLLM-hosted custom models with Portkey and take them to production
Portkey provides a robust and secure platform to observe, govern, and manage your locally or privately hosted custom models using vLLM.
vLLM supports a wide range of model architectures; see the supported models page in the vLLM documentation for the full list.
Expose your vLLM Server
Expose your vLLM server by using a tunneling service like ngrok or any other way you prefer. You can skip this step if you’re self-hosting the Gateway.
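For example, if your server is running locally on vLLM's default port, a minimal tunneling sketch with ngrok could look like this (the port is an assumption; adjust it to match your deployment):

```sh
# Assumption: your vLLM OpenAI-compatible server is already running on its
# default port, e.g. `vllm serve <your-model>` listening on localhost:8000.
# Expose that port publicly and note the forwarding URL ngrok prints:
ngrok http 8000
```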
Install the Portkey SDK
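For the Node.js SDK (used in the sketches below), install the published portkey-ai package:

```sh
npm install --save portkey-ai
```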
Initialize Portkey with vLLM custom URL
Pass your publicly exposed vLLM server URL to Portkey through the customHost parameter (by default, vLLM serves on http://localhost:8000/v1) and set provider to openai, since the vLLM server follows the OpenAI API schema. You can read more about custom_host in the Portkey documentation.
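A minimal initialization sketch with the Node.js SDK follows; the server URL is a placeholder for wherever your vLLM endpoint is exposed:

```js
import Portkey from 'portkey-ai';

// Sketch: route requests through Portkey to a self-hosted vLLM server.
// The ngrok URL below is a placeholder for wherever your server is exposed.
const portkey = new Portkey({
  apiKey: "PORTKEY_API_KEY",   // your Portkey API key
  provider: "openai",          // vLLM follows the OpenAI API schema
  customHost: "https://your-tunnel.ngrok.app/v1"
});
```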
Invoke Chat Completions
Use the Portkey SDK to invoke chat completions from your model, just as you would with any other provider:
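A minimal sketch, assuming the client initialized above; the model name is a placeholder, so use whichever model your vLLM server is actually serving:

```js
// Sketch: replace the model name with the one loaded by your vLLM server.
const chatCompletion = await portkey.chat.completions.create({
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [{ role: "user", content: "Say this is a test" }]
});

console.log(chatCompletion.choices[0].message.content);
```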
Virtual Keys serve as Portkey’s unified authentication system for all LLM interactions, simplifying the use of multiple providers and Portkey features within your application. For self-hosted LLMs, you can configure custom authentication requirements including authorization keys, bearer tokens, or any other headers needed to access your model:
When creating the virtual key in the Portkey dashboard, select OpenAI as the provider (since vLLM follows the OpenAI API schema) and add your vLLM server URL in the Custom Host field, along with any authentication headers your server requires.
You can now use this virtual key in your requests:
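For example, a sketch with the Node.js SDK; the virtual key value is a placeholder for the one created in your dashboard:

```js
import Portkey from 'portkey-ai';

// Sketch: "vllm-virtual-key" is a placeholder for the virtual key
// created in your Portkey dashboard.
const portkey = new Portkey({
  apiKey: "PORTKEY_API_KEY",
  virtualKey: "vllm-virtual-key"
});

const chatCompletion = await portkey.chat.completions.create({
  model: "meta-llama/Llama-3.1-8B-Instruct",   // placeholder model name
  messages: [{ role: "user", content: "Hello from my self-hosted model!" }]
});

console.log(chatCompletion.choices[0].message.content);
```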
For more information about managing self-hosted LLMs with Portkey, see Bring Your Own LLM.
Explore the complete list of features supported in the SDK, and refer to the relevant documentation sections for more information.