See the official Triton Inference Server documentation for more details.
Integrating Custom Models with Portkey SDK
1
Expose your Triton Server
Expose your Triton server by using a tunneling service like ngrok or any other way you prefer. You can skip this step if you’re self-hosting the Gateway.
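For example, if Triton is serving HTTP on its default port (8000), a tunnel can be opened with ngrok. The port and the choice of ngrok are assumptions; any tunneling service works:

```shell
# Tunnel Triton's default HTTP port (8000) to a public URL.
# ngrok prints a forwarding URL such as https://<id>.ngrok.app;
# use that URL as the customHost value in the steps below.
ngrok http 8000
```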
2
Install the Portkey SDK
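The Portkey SDK is published as the portkey-ai package for both Node.js and Python:

```shell
# Node.js
npm install portkey-ai

# Python
pip install portkey-ai
```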
3
Initialize Portkey with Triton custom URL
- Pass your publicly-exposed Triton server URL to Portkey with customHost (custom_host in the Python SDK).
- Set the target provider as triton.
4
Invoke Chat Completions
Use the Portkey SDK to invoke chat completions (generate) from your model, just as you would with any other provider:
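Under the hood, the SDK call reduces to a single HTTP request to the Portkey gateway, with the provider and custom host carried as request headers. The sketch below assembles such a request using only the standard library; the header names follow Portkey's gateway conventions, and the gateway URL, model name, and ngrok URL are illustrative placeholders, not values from this guide:

```python
import json

# Assumed Portkey gateway endpoint (placeholder; adjust if self-hosting).
PORTKEY_GATEWAY_URL = "https://api.portkey.ai/v1/chat/completions"

def build_chat_request(api_key, custom_host, model, messages):
    """Assemble the URL, headers, and JSON body for a Triton-backed
    chat-completions call routed through the Portkey gateway."""
    headers = {
        "Content-Type": "application/json",
        "x-portkey-api-key": api_key,
        "x-portkey-provider": "triton",       # route the request to Triton
        "x-portkey-custom-host": custom_host, # your publicly exposed Triton URL
    }
    body = json.dumps({"model": model, "messages": messages})
    return PORTKEY_GATEWAY_URL, headers, body

url, headers, body = build_chat_request(
    api_key="YOUR_PORTKEY_API_KEY",
    custom_host="https://<your-tunnel-id>.ngrok.app",  # from step 1
    model="your-triton-model",                         # hypothetical model name
    messages=[{"role": "user", "content": "Hello from Triton!"}],
)
```

The SDK's customHost and provider parameters map directly onto the two x-portkey-* headers shown here, so the same values apply whichever interface you use.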
Next Steps
Explore the complete list of features supported in the SDK:
You’ll find more information in the relevant sections.