Available on the Enterprise plan and for select Pro customers. Contact support@portkey.ai if you would like to enable it for your org.
Setting Rate Limits on Providers
When creating a new provider on Portkey, you can set rate limits in two ways:
Request-Based Limits
Set a maximum number of requests that can be made within a specified time period (per minute, hour, or day).
Token-Based Limits
Set a maximum number of tokens that can be consumed within a specified time period (per minute, hour, or day).
Key Considerations
- Rate limits can be set as either request-based or token-based
- Time intervals can be configured as per minute, per hour, or per day
- Setting the limit to 0 disables the provider
- Rate limits apply immediately after being set
- Once set, rate limits cannot be edited by any organization member
- Rate limits work for all providers available on Portkey and apply to all organization members who use the provider
- After a rate limit is reached, requests will be rejected until the time period resets
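To make these rules concrete, here is a minimal local model of a limit that counts requests or tokens and resets on a fixed window. It is a sketch for reasoning about the behavior, not Portkey's implementation: the `FixedWindowLimit` class, its field names, and the fixed-window alignment are all assumptions, and charging tokens before the request is served is a simplification.

```python
from dataclasses import dataclass, field

# Window lengths for the three supported intervals.
WINDOW_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}


@dataclass
class FixedWindowLimit:
    """Hypothetical model of a provider rate limit (illustration only)."""

    limit: int              # max requests or tokens per window; 0 disables the provider
    interval: str           # "minute", "hour", or "day"
    kind: str = "requests"  # "requests" or "tokens"
    _window_start: float = field(default=0.0, init=False)
    _used: int = field(default=0, init=False)

    def allow(self, now: float, tokens: int = 0) -> bool:
        """Return True if a request arriving at time `now` (seconds) is accepted."""
        if self.limit == 0:
            return False  # a limit of 0 disables the provider entirely

        window = WINDOW_SECONDS[self.interval]
        if now - self._window_start >= window:
            # Assumed behavior: usage resets at the start of each new window.
            self._window_start = now - (now % window)
            self._used = 0

        cost = 1 if self.kind == "requests" else tokens
        if self._used + cost > self.limit:
            return False  # rejected until the time period resets
        self._used += cost
        return True


# Two requests per minute: the third is rejected, then the window resets.
limit = FixedWindowLimit(limit=2, interval="minute")
print([limit.allow(now=t) for t in (0.0, 10.0, 59.0, 60.0)])
```

The same class models a token-based limit by passing `kind="tokens"` and the token count of each request.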
Rate Limit Intervals
You can choose from three different time intervals for your rate limits:
- Per Minute: Limits reset every minute, ideal for fine-grained control
- Per Hour: Limits reset hourly, providing balanced usage control
- Per Day: Limits reset daily, suitable for broader usage patterns
Exceeding Rate Limits
When a rate limit is reached:
- Subsequent requests are rejected with a specific error code
- Error messages clearly indicate that the rate limit has been exceeded
- The limit automatically resets after the specified time period has elapsed
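On the client side, one common way to handle a rejected request is to retry with exponential backoff. The sketch below assumes the rejection arrives as HTTP status 429 ("Too Many Requests"); this page does not state the actual error code, so check the response you receive and adjust `RATE_LIMIT_STATUS`. The `call_with_backoff` helper and its `send` callback are hypothetical names, not part of any Portkey SDK.

```python
import time

# Assumption: the conventional HTTP status for rate limiting. The actual
# error code returned by the gateway is not specified in this document.
RATE_LIMIT_STATUS = 429


def call_with_backoff(send, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call `send()` until it is not rate limited or attempts run out.

    `send` is any zero-argument callable returning a (status, body) tuple.
    Delays double after each rate-limited attempt: 1s, 2s, 4s, ...
    """
    for attempt in range(max_attempts):
        status, body = send()
        if status != RATE_LIMIT_STATUS:
            return status, body
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    # Still rate limited after the final attempt; hand the rejection back.
    return status, body
```

Backoff of this kind is mainly useful for per-minute limits. With an hourly or daily limit the wait until reset is usually too long to retry inline, so surfacing the error or routing to another provider is likely the better choice.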
Editing Rate Limits
Rate limits cannot be edited in place. To change a limit, duplicate the existing provider and create a new one with the desired limit.
Monitoring Your Usage
You can track your request and token usage for any specific provider by navigating to the Analytics tab and filtering by the desired provider and timeframe.
Use Cases for Rate Limits
- Cost Control: Prevent unexpected usage spikes that could lead to high costs
- Performance Management: Ensure your application maintains consistent performance
- Fairness: Distribute API access fairly across teams or users
- Security: Mitigate potential abuse or DoS attacks
- Provider Compliance: Stay within the rate limits imposed by underlying AI providers