Overview
Ollama is a local LLM runner that allows you to run large language models on your own hardware. Open WebUI provides native integration with Ollama, supporting both local and remote Ollama instances.Quick Start
Install Ollama
Download and install Ollama from ollama.ai
Configuration
Environment Variables
Admin Panel Configuration
Navigate to Admin Panel > Settings > Connections to configure Ollama:- Enable Ollama API: Toggle to enable/disable Ollama integration
- Base URLs: Add one or more Ollama server URLs
- API Configurations: Configure advanced settings per instance
Advanced Configuration
Multiple Ollama Instances
Open WebUI supports load balancing across multiple Ollama instances:Authentication
For secured Ollama instances:Authorization: Bearer {key}
Model Filtering
Filter specific models from an Ollama instance:Model Prefixing
Add prefixes to distinguish models from different instances:local.llama2 and remote.llama2.
API Endpoints
Open WebUI proxies the following Ollama API endpoints:Model Management
-
GET /ollama/api/tags- List available models
File: backend/open_webui/routers/ollama.py:448 -
POST /ollama/api/pull- Pull a model from registry
File: backend/open_webui/routers/ollama.py:708 -
POST /ollama/api/create- Create a model from Modelfile
File: backend/open_webui/routers/ollama.py:784 -
DELETE /ollama/api/delete- Delete a model
File: backend/open_webui/routers/ollama.py:874 -
POST /ollama/api/show- Show model information
File: backend/open_webui/routers/ollama.py:943
Inference
-
POST /ollama/api/generate- Generate completion
File: backend/open_webui/routers/ollama.py:1192 -
POST /ollama/api/chat- Chat completion
File: backend/open_webui/routers/ollama.py:1281 -
POST /ollama/api/embed- Generate embeddings
File: backend/open_webui/routers/ollama.py:1014
OpenAI Compatible
-
POST /ollama/v1/chat/completions- OpenAI-compatible chat
File: backend/open_webui/routers/ollama.py:1496 -
POST /ollama/v1/completions- OpenAI-compatible completions
File: backend/open_webui/routers/ollama.py:1412
Docker Integration
All-in-One Container
Separate Containers
docker-compose.yml
Troubleshooting
Connection Errors
Cannot connect to Ollama
Cannot connect to Ollama
Docker Network Issues:If using Docker, ensure you’re using the correct hostname:
- Same machine:
http://host.docker.internal:11434 - Different container:
http://ollama:11434 - Network mode host:
http://localhost:11434
Models not appearing
Models not appearing
- Verify Ollama is running:
ollama list - Check ENABLE_OLLAMA_API is set to
True - Refresh the models list in the UI
- Check browser console for errors
Port 11434 in use
Port 11434 in use
Open WebUI will automatically try port 12434 as fallback.File: backend/open_webui/config.py:1046
Performance Optimization
Load Balancing: The current implementation uses random selection for routing requests.
For production deployments, consider implementing weighted round-robin or least-connections algorithms.File: backend/open_webui/routers/ollama.py:1
User Info Forwarding
Forward user information to Ollama for logging and access control:X-OpenWebUI-User-NameX-OpenWebUI-User-IdX-OpenWebUI-User-EmailX-OpenWebUI-User-RoleX-OpenWebUI-Chat-Id
Best Practices
Use Model Prefixes
Distinguish models from different instances with prefixes
Monitor Resources
Use
GET /ollama/api/ps to see loaded models and memory usageEnable Caching
Models are cached for better performance (default: 5 minutes TTL)
GPU Allocation
Configure model-specific GPU allocation in Ollama Modelfile