Docker Deployment - Open WebUI

Open WebUI can be deployed with Docker. This guide covers several deployment scenarios: a standalone container, a container with bundled Ollama, and containers with GPU support.

Prerequisites

Docker installed on your system
For GPU support: NVIDIA Container Toolkit on Linux or WSL

When you install Open WebUI with Docker, include -v open-webui:/app/backend/data in your docker run command. This mounts a named volume at the path where Open WebUI stores its database, so your data survives when the container is removed or recreated.
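
One way to confirm that the volume is attached is to inspect the container's mounts once the container exists. The sketch below assumes the container name open-webui used throughout this guide; the helper name has_data_volume is hypothetical and not part of Open WebUI.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given container has a mount
# whose destination is /app/backend/data.
has_data_volume() {
  container="$1"
  # Print one "name destination" pair per line for every mount,
  # then look for the Open WebUI data path.
  docker inspect \
    --format '{{range .Mounts}}{{println .Name .Destination}}{{end}}' \
    "$container" | grep -q ' /app/backend/data$'
}
```

For example, `has_data_volume open-webui && echo "data volume attached"` prints the message only when the mount is present.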

Quick Start

If Ollama is on Your Computer

Connect to Ollama running on the same machine:

docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Access Open WebUI at http://localhost:3000

If Ollama is on a Different Server

Connect to Ollama on another server by setting OLLAMA_BASE_URL to that server's URL:

docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=https://example.com -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
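
Before starting the container, it can help to check that the remote Ollama server answers at the URL you plan to use. The sketch below queries Ollama's /api/tags endpoint (which lists installed models) from the host; the helper name ollama_reachable and the 5-second timeout are assumptions. Note that a successful check from the host does not guarantee the container can reach the same address.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if an Ollama server responds at the
# given base URL. A trailing slash on the URL is tolerated.
ollama_reachable() {
  base="$1"
  curl --silent --fail --max-time 5 --output /dev/null \
    "${base%/}/api/tags"
}
```

For example: `ollama_reachable https://example.com || echo "Ollama is not reachable"`.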

OpenAI API Only

If you’re only using the OpenAI API without Ollama:

docker run -d -p 3000:8080 -e OPENAI_API_KEY=your_secret_key -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
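
Passing the key with -e places it on the command line and in your shell history. As an alternative sketch, the key can be written to a file readable only by your user and handed to Docker with the --env-file flag of docker run. The helper name write_env_file and the file name openwebui.env are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: write OPENAI_API_KEY=<key> to the given path.
# The subshell keeps the restrictive umask from leaking into the
# caller; a newly created file ends up with mode 600.
write_env_file() {
  path="$1"
  key="$2"
  ( umask 077; printf 'OPENAI_API_KEY=%s\n' "$key" > "$path" )
}
```

After `write_env_file ./openwebui.env your_secret_key`, replace `-e OPENAI_API_KEY=your_secret_key` in the command above with `--env-file ./openwebui.env`. The key is still visible to anyone who can run docker inspect on the container.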

GPU Support

NVIDIA GPU with CUDA

Run Open WebUI with NVIDIA GPU support:

docker run -d -p 3000:8080 --gpus all --add-host=host.docker.internal:host-gateway -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:cuda

Bundled Ollama Installation

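The :ollama image bundles Open WebUI with Ollama in a single container. The command below passes --gpus=all to enable GPU access; omit that flag for a CPU-only setup: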

docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama

Docker Compose

Basic Setup

Create a docker-compose.yaml file:

docker-compose.yaml

services:
  ollama:
    volumes:
      - ollama:/root/.ollama
    container_name: ollama
    pull_policy: always
    tty: true
    restart: unless-stopped
    image: ollama/ollama:${OLLAMA_DOCKER_TAG-latest}

  open-webui:
    build:
      context: .
      dockerfile: Dockerfile
    image: ghcr.io/open-webui/open-webui:${WEBUI_DOCKER_TAG-main}
    container_name: open-webui
    volumes:
      - open-webui:/app/backend/data
    depends_on:
      - ollama
    ports:
      - ${OPEN_WEBUI_PORT-3000}:8080
    environment:
      - 'OLLAMA_BASE_URL=http://ollama:11434'
      - 'WEBUI_SECRET_KEY='
    extra_hosts:
      - host.docker.internal:host-gateway
    restart: unless-stopped

volumes:
  ollama: {}
  open-webui: {}

Run with:

docker compose up -d
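
The Compose file above uses expressions such as ${OPEN_WEBUI_PORT-3000}, which take the value of the variable if it is set and fall back to the default otherwise. POSIX shells happen to share this syntax, so the behavior can be illustrated without Docker. This is only an illustration: Compose performs its own interpolation, and the function name published_port is hypothetical.

```shell
#!/bin/sh
# Mirrors the ${OPEN_WEBUI_PORT-3000} expression from the Compose file.
published_port() {
  echo "${OPEN_WEBUI_PORT-3000}"
}

unset OPEN_WEBUI_PORT
published_port          # variable unset: prints 3000

OPEN_WEBUI_PORT=8081
published_port          # variable set: prints 8081
```

In practice this means `OPEN_WEBUI_PORT=8081 docker compose up -d` publishes the UI on port 8081 instead of 3000. Note that the `-` form applies the default only when the variable is unset; a variable that is set but empty yields an empty value.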

GPU Support with Docker Compose

Create docker-compose.gpu.yaml to extend the base configuration:

docker-compose.gpu.yaml

services:
  ollama:
    # GPU support
    deploy:
      resources:
        reservations:
          devices:
            - driver: ${OLLAMA_GPU_DRIVER-nvidia}
              count: ${OLLAMA_GPU_COUNT-1}
              capabilities:
                - gpu

Run with GPU support:

docker compose -f docker-compose.yaml -f docker-compose.gpu.yaml up -d
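
Compose merges the files passed with -f in order, so the GPU file adds its deploy section to the ollama service defined in the base file. To preview the merged result before starting anything, docker compose config renders the resolved configuration. The wrapper name show_merged_config below is hypothetical.

```shell
#!/bin/sh
# Hypothetical wrapper: print the configuration that results from
# merging the base file with the GPU override, without starting
# any containers.
show_merged_config() {
  docker compose \
    -f docker-compose.yaml \
    -f docker-compose.gpu.yaml \
    config
}
```

Running `show_merged_config` in the directory that holds both files should show the ollama service with the device reservation included.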

Available Docker Images

Open WebUI provides several image tags:

Tag	Description
`main`	Latest stable release
`dev`	Development branch (unstable)
`cuda`	CUDA GPU support
`ollama`	Bundled with Ollama

For CUDA acceleration, use the :cuda tag. For bundled Ollama, use the :ollama tag.

Network Configuration

Host Network Mode

If the container cannot reach Ollama running on the host (for example, at 127.0.0.1:11434), use host networking:

docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main

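With --network=host, the container shares the host's network stack, so no -p port mapping applies and Open WebUI is served directly on port 8080 instead of 3000. Access it at http://localhost:8080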

Build Arguments

When building from the Dockerfile, you can use these build arguments:

docker build \
  --build-arg USE_CUDA=false \
  --build-arg USE_OLLAMA=false \
  --build-arg USE_SLIM=false \
  --build-arg USE_CUDA_VER=cu128 \
  --build-arg USE_EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2 \
  --build-arg USE_RERANKING_MODEL="" \
  --build-arg BUILD_HASH=dev-build \
  -t open-webui .

Build Arguments Reference

USE_CUDA: Enable CUDA support (default: false)
USE_OLLAMA: Bundle Ollama in the image (default: false)
USE_SLIM: Use slim build without downloading models (default: false)
USE_CUDA_VER: CUDA version - cu121 for CUDA 12.1, cu128 for CUDA 12.8 (default: cu128)
USE_EMBEDDING_MODEL: Sentence transformer model (default: sentence-transformers/all-MiniLM-L6-v2)
USE_RERANKING_MODEL: Optional reranking model
USE_AUXILIARY_EMBEDDING_MODEL: Auxiliary embedding model (default: TaylorAI/bge-micro-v2)
BUILD_HASH: Build version hash
UID / GID: User and group IDs (default: 0 for root)
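
As a sketch of how these arguments combine, the function below builds a CUDA-enabled image using only the USE_CUDA and USE_CUDA_VER arguments listed above and leaves the rest at their defaults. The function name build_cuda_image and the tag open-webui:cuda-local are hypothetical; it assumes you run it from the directory containing the Dockerfile.

```shell
#!/bin/sh
# Hypothetical helper: build a local CUDA-enabled image.
# The optional first argument selects the CUDA version string
# (defaults to cu128, matching the documented default).
build_cuda_image() {
  docker build \
    --build-arg USE_CUDA=true \
    --build-arg USE_CUDA_VER="${1:-cu128}" \
    -t open-webui:cuda-local .
}
```

For example, `build_cuda_image cu121` builds against the cu121 variant instead of the default.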

If you change the embedding model, you won’t be able to use RAG Chat with your previous documents. You’ll need to re-embed them.

Container Environment

Open WebUI listens on port 8080 inside the container. The image includes:

Python 3.11.14
Node.js 22 (for frontend build)
FFmpeg for media processing
Pandoc for document conversion

Health Check

The container includes a health check that verifies the service is running:

curl --silent --fail http://localhost:8080/health | jq -ne 'input.status == true'
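
The same /health endpoint can be polled from the host to wait until Open WebUI is ready, for instance in a deployment script. The sketch below assumes the -p 3000:8080 mapping used earlier in this guide, so the default URL points at port 3000; the function name wait_for_webui, the retry count, and the WAIT_INTERVAL variable are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: poll the health endpoint until it responds
# successfully or the attempts run out.
#   $1: health URL (default http://localhost:3000/health)
#   $2: maximum number of attempts (default 30)
# WAIT_INTERVAL sets the pause between attempts in seconds (default 2).
wait_for_webui() {
  url="${1:-http://localhost:3000/health}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --output /dev/null "$url"; then
      return 0
    fi
    i=$((i + 1))
    sleep "${WAIT_INTERVAL:-2}"
  done
  return 1
}
```

For example: `wait_for_webui && echo "Open WebUI is up"`. Docker's own view of the built-in health check is available with `docker inspect --format '{{.State.Health.Status}}' open-webui`.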

Offline Mode

For offline environments, set the HF_HUB_OFFLINE environment variable to 1 so that the Hugging Face Hub client does not attempt to download models over the network:

docker run -d -p 3000:8080 -e HF_HUB_OFFLINE=1 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

Next Steps

Environment Variables

Configure Open WebUI with environment variables

Reverse Proxy

Set up nginx or Apache reverse proxy

Updating

Keep your installation up-to-date

Kubernetes

Deploy on Kubernetes