LLM APIs often have inexplicable failures. With Portkey, you can rescue a substantial number of your requests with our in-built automatic retries feature.
This feature is available on all Portkey plans.
Retry-After
headers for rate-limited requestsTo enable retry, just add the retry
param to your config object.
By default, Portkey triggers retries on the following error codes: [429, 500, 502, 503, 504]
You can change this behaviour by setting the optional on_status_codes
param in your retry config and manually inputting the error codes on which retry will be triggered.
If the on_status_codes
param is present, retries will be triggered only on the error codes specified in that Config and not on Portkey’s default error codes for retries (i.e. [429, 500, 502, 503, 504])
Portkey can respect the provider’s retry-after-ms
, x-ms-retry-after-ms
and retry-after
response headers when encountering rate limits. This enables more intelligent retry timing based on the provider’s response headers rather than using the default exponential backoff strategy.
To enable this feature, add the use_retry_after_headers
parameter to your retry config. By default this behaviour is disabled, and use_retry_after_headers
is set to false
.
When use_retry_after_headers
is set to true
and the provider includes Retry-After
or Retry-After-ms
headers in their response, Portkey will use these values to determine the wait time before the next retry attempt, overriding the exponential backoff strategy.
If the provider doesn’t include these headers in the response, Portkey will fall back to the standard exponential backoff strategy.
The cumulative retry wait time for a single request is capped at 60 seconds. For example, if the first retry has a wait time of 20 seconds, and the second retry response includes a Retry-After value of 50 seconds, the request will fail since the total wait time (20+50=70) exceeds the 60-second cap. Similarly, if any single Retry-After value exceeds 60 seconds, the request will fail immediately.
When not using provider retry headers (or when they’re not available), Portkey triggers retries following this exponential backoff strategy:
Attempt | Time out between requests |
---|---|
Initial Call | Immediately |
Retry 1st attempt | 1 second |
Retry 2nd attempt | 2 seconds |
Retry 3rd attempt | 4 seconds |
Retry 4th attempt | 8 seconds |
Retry 5th attempt | 16 seconds |
This feature is available on all Portkey plans.
To enable retry, just add the retry
param to your config object.
By default, Portkey triggers retries on the following error codes: [429, 500, 502, 503, 504]
You can change this behaviour by setting the optional on_status_codes
param in your retry config and manually inputting the error codes on which rety will be triggered.
If the on_status_codes
param is present, retries will be triggered only on the error codes specified in that Config and not on Portkey’s default error codes for retries (i.e. [429, 500, 502, 503, 504])
Here’s how Portkey triggers retries following exponential backoff:
Attempt | Time out between requests |
---|---|
Initial Call | Immediately |
Retry 1st attempt | 1 second |
Retry 2nd attempt | 2 seconds |
Retry 3rd attempt | 4 seconds |
Retry 4th attempt | 8 seconds |
Retry 5th attempt | 16 seconds |
In the response from Portkey, you can find the x-portkey-retry-attempt-count
header which provides information about retry attempts:
-1
: This means that Portkey exhausted all the retry attempts and the request was unsuccessful0
: This means that there were no retries configured>0
: This means that Portkey attempted retries and this is the exact number at which the request was successfulCurrently, Portkey does not log all the retry attempts individually in the logs dashboard. Instead, the response times from all retry attempts are summed up in the single log entry.