Quick Reference
- Base URL: `https://generativelanguage.googleapis.com/v1beta`
- Authentication: Bearer token (OpenAI endpoint) OR `x-goog-api-key` (native endpoint)
- API Format: Dual support - OpenAI-compatible AND native Google format
- Usage Tracking: `data.usage` (OpenAI endpoint) OR `data.usageMetadata` (native)
- BYOK Support: ✓ Supported
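The two authentication styles differ only in the header carrying the token. A minimal sketch (the helper names are mine; the token is whatever Lava issues you):

```python
# Both endpoints share the base URL from the quick reference above.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def openai_style_headers(token: str) -> dict:
    """OpenAI-compatible endpoint: standard Bearer authentication."""
    return {"Authorization": f"Bearer {token}", "Content-Type": "application/json"}


def native_style_headers(token: str) -> dict:
    """Native Google endpoint: the key goes in the x-goog-api-key header."""
    return {"x-goog-api-key": token, "Content-Type": "application/json"}
```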
Current Models (October 2025)
Gemini 2.5 Series (Latest)
- gemini-2.5-pro - State-of-the-art thinking model with 1M token context
- gemini-2.5-flash - Best price/performance balance, 1M context
- gemini-2.5-flash-lite - Fastest, most cost-efficient option
- gemini-2.5-flash-image - Image generation support
- gemini-2.5-pro-preview-tts - Text-to-speech capabilities
- gemini-2.5-flash-preview-tts - TTS (faster variant)
Gemini 2.0 Series
- gemini-2.0-flash - Second generation workhorse model
- gemini-2.0-flash-lite - Small, efficient workhorse
- gemini-2.0-flash-preview-image-generation - Image generation
Integration Example
Google Gemini supports two API formats: OpenAI-compatible (recommended for simplicity) and native Google format (for advanced features).

Prerequisites
1. Get your Lava forward token:
   - Navigate to Build > Secret Keys
   - Find the “Self Forward Token” section
   - Click the copy icon
2. Set up environment variables in `.env.local`
3. Run requests from your backend server (CORS blocks frontend requests for security)
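Step 2 might look like the following; the variable name `LAVA_FORWARD_TOKEN` is illustrative, not a name Lava prescribes:

```shell
# .env.local — keep this server-side; never expose the token to the browser
LAVA_FORWARD_TOKEN=your-forward-token-here
```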
Option 1: OpenAI-Compatible Endpoint (Recommended)
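A minimal sketch using Python's `requests`, assuming Google's OpenAI-compatibility layer at the `/openai` path under the base URL above and a `LAVA_FORWARD_TOKEN` environment variable (both worth verifying for your setup):

```python
import os

import requests  # third-party: pip install requests

# Google's OpenAI-compatibility layer lives under the /openai path of the
# base URL from the quick reference; confirm the path for your deployment.
BASE_URL = "https://generativelanguage.googleapis.com/v1beta/openai"


def build_chat_request(prompt: str, model: str = "gemini-2.5-flash") -> dict:
    """Standard OpenAI chat-completions payload."""
    return {"model": model, "messages": [{"role": "user", "content": prompt}]}


def chat(prompt: str) -> str:
    """POST the payload with Bearer auth and return the first choice's text."""
    resp = requests.post(
        f"{BASE_URL}/chat/completions",
        headers={"Authorization": f"Bearer {os.environ['LAVA_FORWARD_TOKEN']}"},
        json=build_chat_request(prompt),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["choices"][0]["message"]["content"]
```

Run this server-side only, per the CORS note in the prerequisites.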
Option 2: Native Google Format
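A comparable sketch against the native `generateContent` endpoint, with the `x-goog-api-key` header; the model name and environment variable are placeholders:

```python
import os

import requests  # third-party: pip install requests

BASE_URL = "https://generativelanguage.googleapis.com/v1beta"


def build_generate_request(prompt: str) -> dict:
    """Native generateContent payload: a list of contents, each with parts."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


def generate(prompt: str, model: str = "gemini-2.5-flash") -> str:
    """POST to models/{model}:generateContent and return the first candidate's text."""
    resp = requests.post(
        f"{BASE_URL}/models/{model}:generateContent",
        headers={"x-goog-api-key": os.environ["LAVA_FORWARD_TOKEN"]},
        json=build_generate_request(prompt),
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()["candidates"][0]["content"]["parts"][0]["text"]
```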
Request/Response Formats
OpenAI-Compatible Format
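A representative OpenAI-compatible request body (values illustrative):

```json
{
  "model": "gemini-2.5-flash",
  "messages": [
    {"role": "user", "content": "Explain quantum entanglement in one sentence."}
  ]
}
```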
Native Google Format
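A representative native-format request body (values illustrative):

```json
{
  "contents": [
    {"parts": [{"text": "Explain quantum entanglement in one sentence."}]}
  ]
}
```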
Key Features
Multi-Modal Capabilities
- Text, images, video, audio: Send multiple content types in single request
- 1M token context: Process extremely long documents (Gemini 2.5 Pro/Flash)
- Vision understanding: Analyze images and diagrams
Advanced Features
- Thinking mode: Extended reasoning for complex tasks
- Code execution: Built-in code interpreter for mathematical computations
- Google Search grounding: Real-time web search integration
- Google Maps grounding: Location-based context and information
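Grounding and the code interpreter are enabled per-request via the native format's `tools` field. A sketch, assuming the public v1beta tool names (verify against current Google docs):

```python
def build_grounded_request(prompt: str) -> dict:
    """Native payload with Google Search grounding enabled.

    Swap in {"code_execution": {}} to enable the built-in code interpreter
    instead; tool field names here follow the public v1beta native API.
    """
    return {
        "contents": [{"parts": [{"text": prompt}]}],
        "tools": [{"google_search": {}}],
    }
```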
Context and Limits
- Input: Up to 1,048,576 tokens (1M context window)
- Output: 8,192 to 65,536 tokens (model-dependent)
Usage Tracking
OpenAI-Compatible Endpoint
Usage data is available in the response body at `data.usage`.
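Reading those counters from a parsed response might look like this (field names follow the OpenAI schema; the helper is mine):

```python
def extract_openai_usage(data: dict) -> dict:
    """Pull token counts from an OpenAI-compatible response body."""
    usage = data.get("usage", {})
    return {
        "prompt_tokens": usage.get("prompt_tokens", 0),
        "completion_tokens": usage.get("completion_tokens", 0),
        "total_tokens": usage.get("total_tokens", 0),
    }
```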
Native Endpoint
Usage data is available at `data.usageMetadata`.
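The native endpoint uses camelCase counters under `usageMetadata`; a defensive reader (helper name is mine):

```python
def extract_native_usage(data: dict) -> dict:
    """Pull token counts from a native generateContent response body."""
    meta = data.get("usageMetadata", {})
    return {
        "prompt_tokens": meta.get("promptTokenCount", 0),
        "output_tokens": meta.get("candidatesTokenCount", 0),
        "total_tokens": meta.get("totalTokenCount", 0),
    }
```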