Integrate vLLM-hosted custom models with Portkey and take them to production
Portkey provides a robust and secure platform to observe, govern, and manage your locally or privately hosted custom models using vLLM.
vLLM supports a wide range of model architectures; see the supported models page in the vLLM documentation for the full list.
Expose your vLLM Server
Expose your vLLM server by using a tunneling service like ngrok or any other way you prefer. You can skip this step if you’re self-hosting the Gateway.
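For example, if your server is running locally on vLLM's default port, a minimal tunneling sketch with ngrok could look like this (the port is an assumption; adjust it to match your deployment):

```sh
# Assumption: your vLLM OpenAI-compatible server is already running on its
# default port, e.g. `vllm serve <your-model>` listening on localhost:8000.
# Expose that port publicly and note the forwarding URL ngrok prints:
ngrok http 8000
```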
Install the Portkey SDK
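For the Node.js SDK (used in the sketches below), install the published portkey-ai package:

```sh
npm install --save portkey-ai
```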
Initialize Portkey with vLLM custom URL
Pass your publicly exposed vLLM server URL to Portkey through the customHost parameter (by default, vLLM serves on http://localhost:8000/v1) and set provider to openai, since the vLLM server follows the OpenAI API schema. You can read more about custom_host in the Portkey documentation.
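A minimal initialization sketch with the Node.js SDK follows; the server URL is a placeholder for wherever your vLLM endpoint is exposed:

```js
import Portkey from 'portkey-ai';

// Sketch: route requests through Portkey to a self-hosted vLLM server.
// The ngrok URL below is a placeholder for wherever your server is exposed.
const portkey = new Portkey({
  apiKey: "PORTKEY_API_KEY",   // your Portkey API key
  provider: "openai",          // vLLM follows the OpenAI API schema
  customHost: "https://your-tunnel.ngrok.app/v1"
});
```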
Invoke Chat Completions
Use the Portkey SDK to invoke chat completions from your model, just as you would with any other provider:
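A minimal sketch, assuming the client initialized above; the model name is a placeholder, so use whichever model your vLLM server is actually serving:

```js
// Sketch: replace the model name with the one loaded by your vLLM server.
const chatCompletion = await portkey.chat.completions.create({
  model: "meta-llama/Llama-3.1-8B-Instruct",
  messages: [{ role: "user", content: "Say this is a test" }]
});

console.log(chatCompletion.choices[0].message.content);
```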
Virtual Keys serve as Portkey’s unified authentication system for all LLM interactions, simplifying the use of multiple providers and Portkey features within your application. For self-hosted LLMs, you can configure custom authentication requirements including authorization keys, bearer tokens, or any other headers needed to access your model:
When creating the virtual key in the Portkey dashboard, select OpenAI as the provider (since vLLM follows the OpenAI API schema) and add your vLLM server URL in the Custom Host field, along with any authentication headers your server requires.
You can now use this virtual key in your requests:
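For example, a sketch with the Node.js SDK; the virtual key value is a placeholder for the one created in your dashboard:

```js
import Portkey from 'portkey-ai';

// Sketch: "vllm-virtual-key" is a placeholder for the virtual key
// created in your Portkey dashboard.
const portkey = new Portkey({
  apiKey: "PORTKEY_API_KEY",
  virtualKey: "vllm-virtual-key"
});

const chatCompletion = await portkey.chat.completions.create({
  model: "meta-llama/Llama-3.1-8B-Instruct",   // placeholder model name
  messages: [{ role: "user", content: "Hello from my self-hosted model!" }]
});

console.log(chatCompletion.choices[0].message.content);
```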
For more information about managing self-hosted LLMs with Portkey, see Bring Your Own LLM.
Explore the complete list of features supported in the SDK, and refer to the relevant documentation sections for more information.