LLM Provider Integrations

Overview

Open WebUI supports multiple LLM providers through OpenAI-compatible API endpoints. This includes native integrations with Anthropic Claude, Google Gemini, and many others.

Supported Providers

Anthropic Claude

Claude 3 Opus, Sonnet, and Haiku models

Google AI

Gemini Pro, Gemini Ultra, and PaLM 2

OpenRouter

Access to 100+ models from multiple providers

Groq

Ultra-fast LLM inference

Mistral AI

Mistral, Mixtral, and specialized models

Together AI

Open source models with fast inference

Anthropic Claude

Configuration

OPENAI_API_BASE_URL=https://api.anthropic.com/v1
OPENAI_API_KEY=sk-ant-...
ENABLE_OPENAI_API=True

Available Models

claude-3-opus-20240229 - Most powerful model
claude-3-sonnet-20240229 - Balanced performance and speed
claude-3-haiku-20240307 - Fastest and most affordable
claude-2.1 - Previous generation

Open WebUI automatically detects Anthropic URLs and applies appropriate model mappings.File: backend/open_webui/utils/anthropic.py

Features

System prompts
Tool/function calling
Vision capabilities (Claude 3)
Long context (200K tokens)

Google AI (Gemini)

Configuration

GEMINI_API_KEY=your-api-key
GEMINI_API_BASE_URL=https://generativelanguage.googleapis.com/v1

Available Models

gemini-1.5-pro - Latest and most capable
gemini-1.5-flash - Optimized for speed
gemini-1.0-pro - Previous generation

Features

Multimodal (text, image, video, audio)
2M context window (Gemini 1.5 Pro)
Code execution
Function calling

OpenRouter

Access 100+ models from multiple providers through a single API.

Configuration

OPENAI_API_BASE_URL=https://openrouter.ai/api/v1
OPENAI_API_KEY=sk-or-v1-...

OpenRouter requires HTTP-Referer and X-Title headers. Open WebUI automatically adds these when it detects an OpenRouter URL.File: backend/open_webui/routers/openai.py:134

Popular Models

anthropic/claude-3-opus
google/gemini-pro-1.5
meta-llama/llama-3-70b-instruct
mistralai/mixtral-8x7b-instruct
openai/gpt-4-turbo

Groq

Ultra-fast inference for open source models.

Configuration

OPENAI_API_BASE_URL=https://api.groq.com/openai/v1
OPENAI_API_KEY=gsk_...

Available Models

llama-3.1-70b-versatile
llama-3.1-8b-instant
mixtral-8x7b-32768
gemma-7b-it

Features

Extremely fast inference (500+ tokens/sec)
Free tier available
OpenAI-compatible API

Mistral AI

Configuration

OPENAI_API_BASE_URL=https://api.mistral.ai/v1
OPENAI_API_KEY=your-mistral-key

Available Models

mistral-large-latest - Most capable
mistral-medium-latest - Balanced
mistral-small-latest - Fast and efficient
mixtral-8x7b - Open source MoE
codestral-latest - Code generation specialist

Together AI

Open source models with fast inference.

Configuration

OPENAI_API_BASE_URL=https://api.together.xyz/v1
OPENAI_API_KEY=your-together-key

Features

50+ open source models
Fine-tuning support
Fast inference
Competitive pricing

Perplexity AI

Configuration

OPENAI_API_BASE_URL=https://api.perplexity.ai
OPENAI_API_KEY=pplx-...

Models

llama-3.1-sonar-large-128k-online - With web search
llama-3.1-sonar-small-128k-online - Faster with web search
llama-3.1-70b-instruct - Base model

Cohere

Configuration

OPENAI_API_BASE_URL=https://api.cohere.ai/v1
OPENAI_API_KEY=your-cohere-key

Models

command-r-plus - Most capable
command-r - Balanced
command-light - Fast and efficient

Hugging Face Inference

Configuration

OPENAI_API_BASE_URL=https://api-inference.huggingface.co/models
OPENAI_API_KEY=hf_...

Usage

Access any model on Hugging Face:

model: meta-llama/Llama-2-70b-chat-hf
model: mistralai/Mixtral-8x7B-Instruct-v0.1

Multiple Provider Setup

Configure multiple providers simultaneously:

OPENAI_API_BASE_URLS="https://api.openai.com/v1;https://api.anthropic.com/v1;https://api.groq.com/openai/v1"
OPENAI_API_KEYS="sk-openai-key;sk-ant-key;gsk-groq-key"

Provider-Specific Features

Vision Models

Providers with vision support:

OpenAI: gpt-4-vision-preview, gpt-4-turbo
Anthropic: claude-3-opus, claude-3-sonnet, claude-3-haiku
Google: gemini-1.5-pro, gemini-1.5-flash

Function Calling

Providers with function/tool calling:

OpenAI: All GPT models
Anthropic: Claude 3 models
Google: Gemini models
Mistral: Most models

Streaming

All providers support streaming responses through Server-Sent Events (SSE).

Cost Optimization

Model Selection

Use smaller models (haiku, flash, small) for simple tasks

Provider Comparison

Compare costs across providers for equivalent capabilities

Caching

Enable model caching to reduce duplicate API calls

Free Tiers

Leverage free tiers from Groq, Hugging Face, etc.

Troubleshooting

Provider Not Responding

Verify API key is valid
Check base URL is correct
Ensure provider service is operational
Check firewall/network restrictions

Models Not Appearing

Some providers require specific model IDs in configuration:

{
  "OPENAI_API_CONFIGS": {
    "0": {
      "model_ids": ["model-name-1", "model-name-2"]
    }
  }
}

Rate Limiting

Each provider has different rate limits:

Use multiple API keys
Implement exponential backoff
Monitor usage through provider dashboard

Best Practices

API Key Security: Store keys in environment variables
Model Prefixing: Use prefixes to distinguish provider models
Cost Monitoring: Track usage across providers
Fallback Providers: Configure multiple providers for redundancy
Model Tagging: Use tags to categorize models by capability

References

Anthropic API: docs.anthropic.com
Google AI: ai.google.dev
OpenRouter: openrouter.ai/docs
Groq: console.groq.com/docs
Mistral: docs.mistral.ai

​Overview

​Supported Providers

Anthropic Claude

Google AI

OpenRouter

Groq

Mistral AI

Together AI

​Anthropic Claude

​Configuration

​Available Models

​Features

​Google AI (Gemini)

​Configuration

​Available Models

​Features

​OpenRouter

​Configuration

​Popular Models

​Groq

​Configuration

​Available Models

​Features

​Mistral AI

​Configuration

​Available Models

​Together AI

​Configuration

​Features

​Perplexity AI

​Configuration

​Models

​Cohere

​Configuration

​Models

​Hugging Face Inference

​Configuration

​Usage

​Multiple Provider Setup

​Provider-Specific Features

​Vision Models

​Function Calling

​Streaming

​Cost Optimization

Model Selection

Provider Comparison

Caching

Free Tiers

​Troubleshooting

​Best Practices

​References

Overview

Supported Providers

Anthropic Claude

Configuration

Available Models

Features

Google AI (Gemini)

Configuration

Available Models

Features

OpenRouter

Configuration

Popular Models

Groq

Configuration

Available Models

Features

Mistral AI

Configuration

Available Models

Together AI

Configuration

Features

Perplexity AI

Configuration

Models

Cohere

Configuration

Models

Hugging Face Inference

Configuration

Usage

Multiple Provider Setup

Provider-Specific Features

Vision Models

Function Calling

Streaming

Cost Optimization

Troubleshooting

Best Practices

References