# Run batch inference with Portkey
Portkey supports batching requests in two ways:

1. Directly with the provider's batch API
2. Custom batching with Portkey's batch API

## Batching with the provider's batch API

Portkey can batch requests directly against a provider's batch API through a unified API structure modeled on OpenAI's batch API. If a provider requires additional headers or parameters, refer to that provider's documentation. The following providers and endpoints are supported (a sketch of a request follows the table):
| Provider | Supported Endpoints |
|---|---|
| OpenAI | `completions`, `chat completions`, `embedding` |
| Bedrock | `chat completions` |
| Azure OpenAI | `completions`, `chat completions`, `embedding` |
| Vertex | `embedding`, `chat completions` |
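For illustration, here is a minimal sketch of creating a batch job through the unified, OpenAI-style structure, assuming the Portkey Python SDK (`portkey-ai`); the API key, virtual-key slug, and file ID below are placeholders.

```python
# A minimal sketch, assuming the Portkey Python SDK (portkey-ai);
# the API key, virtual-key slug, and file ID are placeholders.
from portkey_ai import Portkey

client = Portkey(
    api_key="PORTKEY_API_KEY",         # your Portkey API key
    virtual_key="openai-virtual-key",  # provider credential stored in Portkey
)

# The request shape mirrors OpenAI's batch API.
batch = client.batches.create(
    input_file_id="file-abc123",       # an uploaded batch input file
    endpoint="/v1/chat/completions",   # must be an endpoint the provider supports
    completion_window="24h",
)

print(batch.id, batch.status)
```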
## Custom batching with Portkey's batch API

Portkey also supports custom batching through its own batch API, in two modes controlled by the `completion_window` parameter in the request:

- When `completion_window` is set to `24h`, Portkey batches the requests with the provider's batch API, using a Portkey file as input.
- When `completion_window` is set to `immediate`, Portkey batches the requests directly on Portkey's gateway.

In both modes you must also set `portkey_options`, which tells Portkey how to route the batched requests to the provider's batch API or to the gateway. For example, set `portkey_options` to `{"x-portkey-provider": "openai-virtual_key"}`.
For provider-side batching, `completion_window` must be set to `24h` and `input_file_id` must be a Portkey file ID. Refer to Portkey's files documentation for more details. A sketch of such a request follows:
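This sketch calls Portkey's REST batch endpoint directly; the API key, virtual-key slug, and file ID are placeholders.

```python
# A minimal sketch of Portkey's custom batch API in provider mode
# (completion_window="24h"); the API key, virtual-key slug, and
# file ID are placeholders.
import requests

resp = requests.post(
    "https://api.portkey.ai/v1/batches",
    headers={"x-portkey-api-key": "PORTKEY_API_KEY"},
    json={
        "input_file_id": "file_portkey_123",  # must be a Portkey file ID
        "endpoint": "/v1/chat/completions",
        "completion_window": "24h",           # provider-side batching
        # Routes the batch to the right provider credential.
        "portkey_options": {"x-portkey-provider": "openai-virtual_key"},
    },
)
resp.raise_for_status()
print(resp.json())
```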
Once the batch completes, you can retrieve the results from the `GET /batches/<batch_id>/output` endpoint.

For gateway batching, `completion_window` must be set to `immediate` and `input_file_id` must again be a Portkey file ID. Refer to Portkey's files documentation for more details. A sketch follows:
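The sketch below creates a gateway-mode batch and then fetches its output; all keys and IDs are placeholders.

```python
# A minimal sketch of gateway batching (completion_window="immediate");
# all keys and IDs are placeholders.
import requests

resp = requests.post(
    "https://api.portkey.ai/v1/batches",
    headers={"x-portkey-api-key": "PORTKEY_API_KEY"},
    json={
        "input_file_id": "file_portkey_123",  # must be a Portkey file ID
        "endpoint": "/v1/chat/completions",
        "completion_window": "immediate",     # batch on Portkey's gateway
        "portkey_options": {"x-portkey-provider": "openai-virtual_key"},
    },
)
resp.raise_for_status()
batch_id = resp.json()["id"]

# Retrieve the results once the batch has completed.
out = requests.get(
    f"https://api.portkey.ai/v1/batches/{batch_id}/output",
    headers={"x-portkey-api-key": "PORTKEY_API_KEY"},
)
print(out.text)
```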
With gateway batching, you can batch requests to any provider, whether or not the provider supports batching natively.
To make gateway batching more resilient, you can attach an `x-portkey-config` header with a retry configuration.
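For example (a sketch; the retry block follows Portkey's standard config format, and the attempt count and status codes here are arbitrary choices):

```python
# A sketch of attaching a retry policy to gateway batching via the
# x-portkey-config header; the attempt count and status codes are arbitrary.
import json
import requests

retry_config = {"retry": {"attempts": 3, "on_status_codes": [429, 500, 502, 503]}}

resp = requests.post(
    "https://api.portkey.ai/v1/batches",
    headers={
        "x-portkey-api-key": "PORTKEY_API_KEY",        # placeholder
        "x-portkey-config": json.dumps(retry_config),  # retry on transient errors
    },
    json={
        "input_file_id": "file_portkey_123",           # Portkey file ID
        "endpoint": "/v1/chat/completions",
        "completion_window": "immediate",
        "portkey_options": {"x-portkey-provider": "openai-virtual_key"},
    },
)
print(resp.status_code)
```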