Google Gemini

Portkey provides a robust and secure gateway for integrating Large Language Models (LLMs), including the Google Gemini APIs, into your applications. With Portkey, you get fast AI gateway access, observability, prompt management, and more, while your LLM API keys are managed securely through a virtual key system.

Provider slug: google

Portkey SDK Integration with Google Gemini Models

Portkey provides a consistent API to interact with models from various providers. To integrate Google Gemini with Portkey:

1. Install the Portkey SDK

Add the Portkey SDK to your application to interact with Google Gemini’s API through Portkey’s gateway.

NodeJS

npm install --save portkey-ai

Python

pip install portkey-ai

2. Initialize Portkey with the Virtual Key

To use Gemini with Portkey, get your Gemini API key from Google, then add it to Portkey to create a virtual key.

NodeJS SDK

import Portkey from 'portkey-ai'

const portkey = new Portkey({
    apiKey: "PORTKEY_API_KEY", // defaults to process.env["PORTKEY_API_KEY"]
    virtualKey: "VIRTUAL_KEY" // Your Google Virtual Key
})

Python SDK

from portkey_ai import Portkey

portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
    virtual_key="VIRTUAL_KEY"   # Replace with your virtual key for Google
)

3. Invoke Chat Completions with Google Gemini

Use the Portkey instance to send requests to Google Gemini. You can also override the virtual key directly in the API call if needed.

NodeJS SDK

const chatCompletion = await portkey.chat.completions.create({
    messages: [
        { role: 'system', content: 'You are not a helpful assistant' },
        { role: 'user', content: 'Say this is a test' }
    ],
    model: 'gemini-1.5-pro',
});

console.log(chatCompletion.choices);

Python SDK

completion = portkey.chat.completions.create(
    messages=[
        {"role": "system", "content": "You are not a helpful assistant"},
        {"role": "user", "content": "Say this is a test"}
    ],
    model="gemini-1.5-pro"
)

print(completion)

Portkey supports the system_instructions parameter for Google Gemini 1.5, which lets you control the behavior and output of your Gemini-powered applications. Include your Gemini system prompt as a {"role": "system"} message in the messages array of your request body. The Portkey Gateway automatically transforms that message into the format the Google Gemini API expects.
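
To make that transformation concrete, here is a minimal sketch of how an OpenAI-style messages array could be mapped onto the request shape used by Gemini's REST API. It is illustrative only and is not Portkey's actual gateway code: the function name to_gemini_request is hypothetical, and the sketch handles plain string content only.

```python
from __future__ import annotations

from typing import Any


def to_gemini_request(messages: list[dict[str, Any]]) -> dict[str, Any]:
    """Map OpenAI-style chat messages to a Gemini-style request body.

    Hypothetical helper for illustration; handles string content only.
    """
    system_texts: list[str] = []
    contents: list[dict[str, Any]] = []

    for message in messages:
        role, text = message["role"], message["content"]
        if role == "system":
            # System prompts are lifted out of the conversation turns.
            system_texts.append(text)
        else:
            # Gemini names the assistant role "model".
            gemini_role = "model" if role == "assistant" else "user"
            contents.append({"role": gemini_role, "parts": [{"text": text}]})

    body: dict[str, Any] = {"contents": contents}
    if system_texts:
        # Multiple system messages are joined into one instruction (an assumption).
        body["system_instruction"] = {"parts": [{"text": "\n".join(system_texts)}]}
    return body
```

You never need to perform this mapping yourself when calling Portkey; the sketch only shows why a system message in the messages array is enough.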

Function Calling

Portkey supports function calling mode on Google’s Gemini Models. Explore this Cookbook for a deep dive and examples: Function Calling

Document, Video, Audio Processing with Gemini

Gemini supports attaching files such as mp4, pdf, jpg, mp3, and wav to your messages.

Using Portkey, here’s how you can send these media files:

const chatCompletion = await portkey.chat.completions.create({
    messages: [
        { role: 'system', content: 'You are a helpful assistant' },
        { role: 'user', content: [
            {
                type: 'image_url',
                image_url: {
                    url: 'gs://cloud-samples-data/generative-ai/image/scones.jpg'
                }
            },
            {
                type: 'text',
                text: 'Describe the image'
            }
        ]}
    ],
    model: 'gemini-1.5-pro',
    max_tokens: 200
});

The same message format works for all other media types: pass the location of your media file in the url field, for example "url": "gs://cloud-samples-data/video/animals.mp4".

Your URL must include the file extension. The extension is used to infer the MIME_TYPE, which is a required parameter when prompting Gemini models with files.
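
As a rough illustration of extension-based MIME type inference, the sketch below checks a URL before you send it. The mapping table and the helper names infer_mime_type and media_part are assumptions made for this example; the table the gateway actually uses is not shown in this guide.

```python
from __future__ import annotations

import posixpath
from urllib.parse import urlparse

# Small illustrative mapping (an assumption, not the gateway's real table).
MIME_BY_EXTENSION = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".pdf": "application/pdf",
    ".mp4": "video/mp4",
    ".mp3": "audio/mpeg",
    ".wav": "audio/wav",
}


def infer_mime_type(url: str) -> str | None:
    """Return the MIME type implied by the URL's file extension, if known."""
    path = urlparse(url).path  # drops the scheme, bucket/host, and query string
    extension = posixpath.splitext(path)[1].lower()
    return MIME_BY_EXTENSION.get(extension)


def media_part(url: str) -> dict:
    """Build the content part used in the messages array, failing early
    when the URL has no recognizable extension."""
    if infer_mime_type(url) is None:
        raise ValueError(f"Cannot infer a MIME type from {url!r}")
    return {"type": "image_url", "image_url": {"url": url}}
```

Validating on the client side like this surfaces a missing extension before the request reaches the gateway.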

Sending base64 Image

You can also send base64-encoded image data in the url field:

"url": "data:image/png;base64,UklGRkacAABXRUJQVlA4IDqcAAC....."

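A minimal sketch of building such a data URL from raw bytes with Python's standard library follows. The helper names to_data_url and image_message are hypothetical; only the message shape comes from the examples in this guide.

```python
import base64


def to_data_url(data: bytes, mime_type: str) -> str:
    """Encode raw bytes as a base64 data URL for the url field."""
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_message(data: bytes, mime_type: str, prompt: str) -> dict:
    """Build a user message that pairs inline image data with a text prompt."""
    return {
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": to_data_url(data, mime_type)}},
            {"type": "text", "text": prompt},
        ],
    }
```

In practice you would read the bytes from a file, for example with open(path, "rb"), and pass the resulting message in the messages array.
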
Grounding with Google Search

Vertex AI supports grounding with Google Search, a feature that grounds your LLM responses in real-time search results. To invoke grounding, pass the google_search tool (for newer models such as gemini-2.0-flash-001) or the google_search_retrieval tool (for older models such as gemini-1.5-flash) in the tools array.

"tools": [
    {
        "type": "function",
        "function": {
            "name": "google_search" // or google_search_retrieval for older models
        }
    }
]

If you mix regular tools with grounding tools, Vertex AI might return an error saying that only one tool can be used at a time.
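
The sketch below shows one way to pick the grounding tool and assemble request arguments. The rule that model names starting with "gemini-1." take the older tool is an assumption extrapolated from the two example models above, and the helper names grounding_tool and grounded_request are hypothetical.

```python
def grounding_tool(model: str) -> dict:
    """Return the grounding tool entry for the given model name.

    Assumption: names starting with "gemini-1." are treated as older models.
    """
    name = "google_search_retrieval" if model.startswith("gemini-1.") else "google_search"
    return {"type": "function", "function": {"name": name}}


def grounded_request(model: str, prompt: str) -> dict:
    """Build chat-completion arguments that use only the grounding tool,
    since mixing it with regular tools may be rejected."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "tools": [grounding_tool(model)],
    }
```

The resulting dictionary can be unpacked into the call, as in portkey.chat.completions.create(**grounded_request(model, prompt)).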

Extended Thinking (Reasoning Models) (Beta)

The assistant's thinking response is returned in the response_chunk.choices[0].delta.content_blocks array, not in the response.choices[0].message.content string.

Models like gemini-2.5-flash-preview-04-17 support extended thinking. This is similar to OpenAI's thinking, but you also get the model's reasoning as it processes the request. Note that you must set strict_open_ai_compliance=False in the headers to use this feature.

Single turn conversation

from portkey_ai import Portkey

# Initialize the Portkey client
portkey = Portkey(
    api_key="PORTKEY_API_KEY",  # Replace with your Portkey API key
    virtual_key="VIRTUAL_KEY",  # Add your provider's virtual key
    strict_open_ai_compliance=False  # Required to receive thinking content
)

# Create the request (non-streaming)
response = portkey.chat.completions.create(
    model="gemini-2.5-flash-preview-04-17",
    max_tokens=3000,
    thinking={
        "type": "enabled",
        "budget_tokens": 2030
    },
    stream=False,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "text",
                    "text": "when does the flight from new york to bengaluru land tomorrow, what time, what is its flight number, and what is its baggage belt?"
                }
            ]
        }
    ]
)
print(response)

# For streaming responses, parse the response_chunk.choices[0].delta.content_blocks array
# response = portkey.chat.completions.create(
#   ...same config as above but with stream=True
# )
# for chunk in response:
#     if chunk.choices[0].delta:
#         content_blocks = chunk.choices[0].delta.get("content_blocks")
#         if content_blocks is not None:
#             for content_block in content_blocks:
#                 print(content_block)
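
The streaming loop can be wrapped in a small generator so it can be tried without a live request. This is a sketch under assumptions: the helper name iter_content_blocks is hypothetical, the stand-in chunks only mimic the access pattern used in the commented loop, and the shape of each content block is a placeholder because this guide does not specify it.

```python
from types import SimpleNamespace
from typing import Any, Iterable, Iterator


def iter_content_blocks(chunks: Iterable[Any]) -> Iterator[Any]:
    """Yield every content block found in a stream of chat-completion chunks."""
    for chunk in chunks:
        if not chunk.choices:
            continue  # some chunks may carry no choices
        delta = chunk.choices[0].delta
        if not delta:
            continue
        blocks = delta.get("content_blocks")
        if blocks:
            yield from blocks


# Stand-in chunks for a dry run; real chunks come from the SDK stream.
fake_stream = [
    SimpleNamespace(choices=[SimpleNamespace(delta={"content_blocks": [{"placeholder": "block-1"}]})]),
    SimpleNamespace(choices=[SimpleNamespace(delta={"content": "hi"})]),
    SimpleNamespace(choices=[]),
]

for block in iter_content_blocks(fake_stream):
    print(block)
```

With a real stream, pass the object returned by portkey.chat.completions.create(..., stream=True) in place of fake_stream.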

To disable thinking for Gemini models such as gemini-2.5-flash-preview-04-17, you must explicitly set budget_tokens to 0.

"thinking": {
    "type": "enabled",
    "budget_tokens": 0
}
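
A small helper can keep the enabled and disabled cases consistent. The names thinking_config and thinking_request are hypothetical, and the negative-budget check is an extra safeguard added for this sketch rather than a documented rule.

```python
from __future__ import annotations

from typing import Any


def thinking_config(budget_tokens: int) -> dict[str, Any]:
    """Build the thinking parameter; a budget of 0 disables thinking."""
    if budget_tokens < 0:
        raise ValueError("budget_tokens must be zero or positive")
    return {"type": "enabled", "budget_tokens": budget_tokens}


def thinking_request(model: str, prompt: str, max_tokens: int, budget_tokens: int) -> dict[str, Any]:
    """Assemble chat-completion arguments with a thinking budget."""
    return {
        "model": model,
        "max_tokens": max_tokens,
        "thinking": thinking_config(budget_tokens),
        "messages": [{"role": "user", "content": prompt}],
    }
```

For example, thinking_request("gemini-2.5-flash-preview-04-17", "Say this is a test", 3000, 0) produces arguments with thinking disabled, ready to unpack into portkey.chat.completions.create.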

Gemini grounding mode may not work via the Portkey SDK. Contact support@portkey.ai for assistance.

Next Steps

The complete list of features supported in the SDK is available at the link below.

SDK
