
Quick Start

Get started with Lepton AI in under 2 minutes:
from portkey_ai import Portkey

# 1. Install: pip install portkey-ai
# 2. Add @lepton provider in model catalog
# 3. Use it:

portkey = Portkey(api_key="PORTKEY_API_KEY")

response = portkey.chat.completions.create(
    model="@lepton/llama-3.1-8b",
    messages=[{"role": "user", "content": "Hello!"}]
)

print(response.choices[0].message.content)

Add Provider in Model Catalog

Before making requests, add Lepton AI to your Model Catalog:
  1. Go to Model Catalog → Add Provider
  2. Select Lepton AI
  3. Enter your Lepton API key
  4. Name your provider (e.g., lepton)

Complete Setup Guide

See all setup options and detailed configuration instructions

Lepton AI Capabilities

Chat Completions

Generate chat completions with Lepton’s serverless models. Here the provider is set once on the client, so requests can use the bare model name instead of the @lepton/ prefix:
from portkey_ai import Portkey

portkey = Portkey(api_key="PORTKEY_API_KEY", provider="@lepton")

response = portkey.chat.completions.create(
    model="llama-3.1-8b",
    messages=[{"role": "user", "content": "Write a haiku about AI"}]
)

print(response.choices[0].message.content)
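The messages list follows the OpenAI chat format, so a multi-turn exchange is just more entries. A sketch (all message contents below are illustrative placeholders):

```python
# Multi-turn conversation in the OpenAI-style chat format shown above.
# Roles alternate user/assistant after an optional leading system message;
# every string here is an illustrative placeholder.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Write a haiku about AI"},
    {"role": "assistant", "content": "Silent circuits hum / patterns bloom from data streams / a new mind at dawn"},
    {"role": "user", "content": "Now write one about the ocean"},
]

# Pass the list as-is:
# portkey.chat.completions.create(model="llama-3.1-8b", messages=messages)
```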

Speech-to-Text

Transcribe audio using Lepton’s Whisper models:
from portkey_ai import Portkey

portkey = Portkey(api_key="PORTKEY_API_KEY", provider="@lepton")

with open("audio.mp3", "rb") as audio_file:
    transcription = portkey.audio.transcriptions.create(
        file=audio_file,
        model="whisper-large-v3"
    )

print(transcription.text)

Streaming

Enable streaming for real-time responses:
from portkey_ai import Portkey

portkey = Portkey(api_key="PORTKEY_API_KEY", provider="@lepton")

stream = portkey.chat.completions.create(
    model="llama-3.1-8b",
    messages=[{"role": "user", "content": "Write a story about a robot"}],
    stream=True
)

for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")

Supported Models

Lepton AI provides serverless access to various models:
| Model | Description |
| --- | --- |
| llama-3.1-8b | Llama 3.1 8B model |
| llama-3-8b-sft-v1 | Fine-tuned Llama 3 |
| whisper-large-v3 | Speech-to-text |
Check Lepton’s documentation for the complete model list.

Next Steps

Gateway Configs

Add fallbacks, load balancing, and more
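A fallback between providers, for example, is expressed as a gateway config object. A minimal sketch, assuming a second saved provider with the placeholder slug @backup-provider:

```python
# Minimal fallback config (sketch): try Lepton first, then a second provider.
# "@backup-provider" is a placeholder slug; substitute one from your own
# Model Catalog.
fallback_config = {
    "strategy": {"mode": "fallback"},
    "targets": [
        {"provider": "@lepton"},
        {"provider": "@backup-provider"},
    ],
}

# Attach it when constructing the client:
# portkey = Portkey(api_key="PORTKEY_API_KEY", config=fallback_config)
```

See the Gateway Configs documentation for the full config schema.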

Observability

Monitor and trace your Lepton requests

Prompt Library

Manage and version your prompts

Metadata

Add custom metadata to requests
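Metadata is a flat key-value object attached to requests so it shows up in Portkey logs and analytics. A sketch with illustrative keys:

```python
# Custom request metadata (sketch). Keys and values here are illustrative;
# Portkey treats this as a flat string-to-string map logged with each request.
metadata = {
    "_user": "user-123",
    "environment": "staging",
}

# One way to attach it (assumption: passed at client construction):
# portkey = Portkey(api_key="PORTKEY_API_KEY", provider="@lepton", metadata=metadata)
```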
For complete SDK documentation:

SDK Reference

Complete Portkey SDK documentation
Last modified on February 9, 2026