
Overview

DeepInfra offers serverless GPU inference for popular open-source models, with pay-per-use pricing and no infrastructure to manage.

Key Features:
  • Serverless deployment (no GPU management)
  • 100+ open-source models
  • OpenAI-compatible API
  • Automatic scaling
Official Documentation: DeepInfra Docs

Authentication

DeepInfra uses Bearer token authentication in the OpenAI-compatible format.

Lava Forward Token:
${LAVA_SECRET_KEY}.${CONNECTION_SECRET}.${PRODUCT_SECRET}
For BYOK: ${TOKEN}.${YOUR_DEEPINFRA_KEY}
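
The two token formats above can be assembled with a pair of small helpers. This is a minimal sketch, assuming the segments are joined literally with `.` exactly as the format strings show; the function names and parameter names are hypothetical and not part of any Lava SDK.

```javascript
// Hypothetical helper: joins the three secrets with "." as in the
// Lava Forward Token format shown above.
function buildForwardToken({ lavaSecretKey, connectionSecret, productSecret }) {
  const parts = [lavaSecretKey, connectionSecret, productSecret];
  if (parts.some((part) => typeof part !== 'string' || part.length === 0)) {
    throw new Error('lavaSecretKey, connectionSecret and productSecret are all required');
  }
  return parts.join('.');
}

// Hypothetical helper: appends your own DeepInfra key for BYOK,
// following the ${TOKEN}.${YOUR_DEEPINFRA_KEY} format shown above.
function buildByokToken(token, deepInfraKey) {
  if (!token || !deepInfraKey) {
    throw new Error('token and deepInfraKey are both required');
  }
  return `${token}.${deepInfraKey}`;
}
```

Keeping the secrets in environment variables and building the token at startup avoids hard-coding any of the segments in source.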
| Model | Context | Description |
|---|---|---|
| meta-llama/Meta-Llama-3.3-70B-Instruct | 128K | Llama 3.3 flagship |
| mistralai/Mixtral-8x7B-Instruct-v0.1 | 32K | Mistral MoE model |
| Qwen/Qwen2.5-72B-Instruct | 128K | Qwen multilingual |
Endpoint: https://api.deepinfra.com/v1/openai/chat/completions
Usage Tracking: data.usage (OpenAI format)
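
Because usage is reported in the OpenAI format, `data.usage` is expected to carry `prompt_tokens`, `completion_tokens`, and `total_tokens`. The sketch below reads those fields defensively; the helper name `readUsage` and the zero defaults are illustrative choices, not something the API prescribes.

```javascript
// Hypothetical helper: extracts OpenAI-style token counts from a parsed
// chat completion response, defaulting to 0 when a field is absent.
function readUsage(data) {
  const usage = (data && data.usage) || {};
  return {
    promptTokens: usage.prompt_tokens ?? 0,
    completionTokens: usage.completion_tokens ?? 0,
    totalTokens: usage.total_tokens ?? 0,
  };
}
```

Defaulting to zero keeps downstream accounting code from throwing when a response omits the usage block.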

Quick Start

const response = await fetch(
  `https://api.lavapayments.com/v1/forward?u=${encodeURIComponent('https://api.deepinfra.com/v1/openai/chat/completions')}`,
  {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${process.env.LAVA_FORWARD_TOKEN}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'meta-llama/Meta-Llama-3.3-70B-Instruct',
      messages: [{role: 'user', content: 'Hello!'}]
    })
  }
);

const data = await response.json();
console.log('Usage:', data.usage.total_tokens);
console.log('Request ID:', response.headers.get('x-lava-request-id'));
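
The Quick Start call can be wrapped in a reusable helper that also checks the HTTP status before parsing the body. This is a sketch under stated assumptions: the forward URL, the `u` query parameter, and the `x-lava-request-id` header come from the example above, while `buildForwardRequest` and `forwardChat` are hypothetical names, and the error handling is one reasonable choice rather than required behavior.

```javascript
// URLs taken from the Quick Start above.
const LAVA_FORWARD_BASE = 'https://api.lavapayments.com/v1/forward';
const DEEPINFRA_CHAT_URL = 'https://api.deepinfra.com/v1/openai/chat/completions';

// Hypothetical helper: builds the URL and fetch options for a forwarded call.
function buildForwardRequest(forwardToken, body, targetUrl = DEEPINFRA_CHAT_URL) {
  return {
    url: `${LAVA_FORWARD_BASE}?u=${encodeURIComponent(targetUrl)}`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${forwardToken}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical wrapper: sends the request and throws on a non-2xx status
// instead of trying to read usage from an error payload.
async function forwardChat(forwardToken, body) {
  const { url, init } = buildForwardRequest(forwardToken, body);
  const response = await fetch(url, init);
  if (!response.ok) {
    throw new Error(`Forward request failed: ${response.status} ${await response.text()}`);
  }
  return {
    data: await response.json(),
    requestId: response.headers.get('x-lava-request-id'),
  };
}
```

A call such as `forwardChat(process.env.LAVA_FORWARD_TOKEN, { model, messages })` then returns both the parsed response and the request ID in one object.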

BYOK Support

Supported. Get an API key from the DeepInfra Console and use it in the BYOK token format shown under Authentication.
