Portkey provides a robust and secure platform to observe, govern, and manage your locally or privately hosted custom models served with NVIDIA Triton Inference Server.
Integrating Custom Models with Portkey SDK
Expose your Triton Server
Expose your Triton server by using a tunneling service like ngrok or any other way you prefer. You can skip this step if you’re self-hosting the Gateway.
ngrok http 8000 --host-header="localhost:8000"
Install the Portkey SDK
npm install --save portkey-ai
Initialize Portkey with Triton custom URL
- Pass your publicly-exposed Triton server URL to Portkey with customHost
- Set the target provider as triton
import Portkey from 'portkey-ai'

const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY",
    provider: "triton",
    customHost: "http://localhost:8000/v2/models/mymodel", // Your Triton-hosted model URL
    Authorization: "AUTH_KEY" // If you need to pass auth
})
from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",
    provider="triton",
    custom_host="http://localhost:8000/v2/models/mymodel",  # Your Triton-hosted model URL
    Authorization="AUTH_KEY"  # If you need to pass auth
)
More on custom_host here.
Invoke Chat Completions
Use the Portkey SDK to invoke chat completions (generate) from your model, just as you would with any other provider:
const chatCompletion = await portkey.chat.completions.create({
messages: [{ role: 'user', content: 'Say this is a test' }]
});
console.log(chatCompletion.choices);
completion = portkey.chat.completions.create(
    messages=[{"role": "user", "content": "Say this is a test"}]
)
print(completion)
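Under the hood, the gateway translates the OpenAI-style chat payload into a request against your Triton model's endpoint. As a rough, hypothetical illustration only (input field names such as text_input are defined by your own model's configuration, so treat them as placeholders rather than the gateway's actual mapping), the translation might look like:

```python
# Hypothetical sketch: flatten an OpenAI-style chat request into a
# Triton-style generate payload. The "text_input" and "max_tokens"
# field names are assumptions that depend on your model's config.
def chat_to_triton_payload(messages, max_tokens=256):
    # Join the chat turns into a single prompt string
    prompt = "\n".join(f"{m['role']}: {m['content']}" for m in messages)
    return {"text_input": prompt, "max_tokens": max_tokens}

payload = chat_to_triton_payload(
    [{"role": "user", "content": "Say this is a test"}]
)
print(payload)
```

Because Portkey performs this translation for you, your application code stays identical across providers.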
Next Steps
Explore the complete list of features supported in the SDK. You'll find more information in the relevant sections:
- Add metadata to your requests
- Add gateway configs to your requests
- Tracing requests
- Set up a fallback from Triton to your local LLM
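As a starting point for the fallback setup above, a gateway config of roughly the following shape routes requests to Triton first and falls back to another locally hosted model if it fails. This is a sketch: the second target's provider and URL are assumptions you should adapt to your own deployment.

```json
{
  "strategy": { "mode": "fallback" },
  "targets": [
    {
      "provider": "triton",
      "custom_host": "http://localhost:8000/v2/models/mymodel"
    },
    {
      "provider": "ollama",
      "custom_host": "http://localhost:11434"
    }
  ]
}
```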