Introduction 🔎 #
The goal of this post is to create a self-hosted Ollama server, used for deploying and testing open-source LLMs locally.
The main issue is that, by default, there is no password (or any other) protection for the Ollama server. If we were to deploy it on our server and then expose it to the internet in order to access it from other machines, anyone could use it simply by knowing the URL.
To solve this, we are going to see how to set up a dockerized Ollama server with Caddy basic authentication for a single user.
Deployment ✈ #
Starting the Ollama Server #
To deploy our server, we simply run the following command:

```shell
docker compose -f secure-ollama-server/docker-compose.yaml --env-file secure-ollama-server/.env up -d
```

To deploy with GPU support, we also include the `docker-compose.gpu.yaml` override file:

```shell
docker compose -f secure-ollama-server/docker-compose.yaml -f secure-ollama-server/docker-compose.gpu.yaml --env-file secure-ollama-server/.env up -d
```
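The contents of the GPU override file are not shown in this post. A minimal sketch of what `docker-compose.gpu.yaml` could look like, assuming an NVIDIA GPU and the NVIDIA Container Toolkit installed on the host, is:

```yaml
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            # Assumption: reserve all available NVIDIA GPUs for the container.
            - driver: nvidia
              count: all
              capabilities: [gpu]
```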
Stopping the Ollama Server #
To stop the server after we are finished using it, we have to use the following command:
```shell
docker compose -f secure-ollama-server/docker-compose.yaml --env-file secure-ollama-server/.env down
```
How to use ❔ #
In order to use the server, we have to add the username and the password we specified in the .env file to our request.
An example in curl can be seen below:

```shell
curl -u user:password 127.0.0.1:8200/api/generate -d '{"model": "phi", "prompt": "Why is the sky blue?", "stream": false}'
```
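The same request can be made from Python. The sketch below uses only the standard library and the placeholder credentials from the curl example above; it builds the same `Authorization: Basic` header that curl's `-u` flag produces:

```python
import base64
import json
import urllib.request

# Placeholder credentials from the curl example above; replace with
# the values from your .env file.
username, password = "user", "password"

# curl -u sends an "Authorization: Basic <base64(user:password)>" header.
token = base64.b64encode(f"{username}:{password}".encode()).decode()
headers = {"Authorization": f"Basic {token}"}

payload = json.dumps({
    "model": "phi",
    "prompt": "Why is the sky blue?",
    "stream": False,
}).encode()

request = urllib.request.Request(
    "http://127.0.0.1:8200/api/generate",
    data=payload,
    headers=headers,
)

# Uncomment once the server is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["response"])
```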
Development 🛠 #
The following sections contain technical information about the implementation.
Environmental Variables #
In order for the Caddy authentication to work, we need to set up some environment variables. These must be placed in a .env file, located in the same folder as the docker-compose.yaml file. We can easily create this file by taking a look at the .env-template file. An example of our .env file is the following:

```shell
CADDY_USERNAME=admin
CADDY_PASSWORD=secret-password
```
Docker #
Now, let’s take a look at the docker-compose.yaml file:
```yaml
version: '3.8'
services:
  ollama:
    container_name: ollama-server
    build:
      context: .
      dockerfile: Dockerfile
    pull_policy: always
    tty: true
    restart: always
    ports:
      - 8200:80
    volumes:
      - ./ollama:/root/.ollama
    environment:
      - CADDY_USERNAME=${CADDY_USERNAME}
      - CADDY_PASSWORD=${CADDY_PASSWORD}
```
- The Ollama server is deployed on port 8200.
- A restart: always policy is placed on the container.
- A volume is required for persistence.
- The volume is created in the same folder as the docker-compose.yaml.
- The Caddy environment variables are needed to set up the user and their password.
Caddy Setup #
Caddy is automatically configured for us by the Dockerfile. Let’s take a look to see how this is done.
```dockerfile
FROM ollama/ollama:latest

# Update and install wget to download caddy
RUN apt-get update && apt-get install -y wget

# Download and install caddy
RUN wget --no-check-certificate https://github.com/caddyserver/caddy/releases/download/v2.7.6/caddy_2.7.6_linux_amd64.tar.gz \
    && tar -xvf caddy_2.7.6_linux_amd64.tar.gz \
    && mv caddy /usr/bin/ \
    && chown root:root /usr/bin/caddy \
    && chmod 755 /usr/bin/caddy

# Copy the Caddyfile to the container
COPY Caddyfile /etc/caddy/Caddyfile

# Set the environment variable for the ollama host
ENV OLLAMA_HOST 0.0.0.0

# Expose the port that caddy will listen on
EXPOSE 80

# Copy a script to start both ollama and caddy
COPY start_server.sh /start_server.sh
RUN chmod +x /start_server.sh

# Set the entrypoint to the script
ENTRYPOINT ["/start_server.sh"]
```
- We download the v2.7.6 version of the Caddy server and install it.
- The appropriate permissions and ownerships are applied.
- The start_server.sh script is triggered at the end, as the container's entrypoint.
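The Caddyfile itself is not shown in this post. A minimal sketch of what it could look like is given below; it assumes that the start-up script exports a bcrypt hash of the password as a hypothetical CADDY_PASSWORD_HASH variable (Caddy's basicauth directive expects a hash, not a plaintext password) and that Ollama listens on its default port 11434 inside the container:

```
:80 {
    # Require HTTP Basic auth for every request.
    basicauth {
        {env.CADDY_USERNAME} {env.CADDY_PASSWORD_HASH}
    }
    # Forward authenticated requests to the local Ollama server.
    reverse_proxy 127.0.0.1:11434
}
```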
The start_server.sh script performs the following actions:
- Checks for the presence of the required environment variables.
- Starts the Ollama server.
- Starts the Caddy server.
- Handles shutdown signals, in order for the container to exit gracefully.
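The script itself is not reproduced in this post; a hedged sketch of the steps above could look like the following. The password-hashing step is an assumption, since Caddy's basicauth directive expects a bcrypt hash rather than the plaintext password:

```shell
#!/bin/sh
set -e

# 1. Check for the presence of the required environment variables.
if [ -z "$CADDY_USERNAME" ] || [ -z "$CADDY_PASSWORD" ]; then
    echo "CADDY_USERNAME and CADDY_PASSWORD must be set" >&2
    exit 1
fi

# Assumption: hash the plaintext password for Caddy's basicauth directive.
export CADDY_PASSWORD_HASH="$(caddy hash-password --plaintext "$CADDY_PASSWORD")"

# 2. Start the Ollama server in the background.
ollama serve &
OLLAMA_PID=$!

# 3. Start the Caddy server in the background.
caddy run --config /etc/caddy/Caddyfile &
CADDY_PID=$!

# 4. Forward shutdown signals to both processes so the container
#    exits gracefully, then wait for them to finish.
trap 'kill "$OLLAMA_PID" "$CADDY_PID"' TERM INT
wait
```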