Introduction 🔎 #
The goal of this post is to create a self-hosted Ollama server, used for deploying and testing open-source LLMs locally.
The main issue is that, by default, there is no password (or any other) protection for the Ollama server. If we were to deploy it on our server and then expose it to the internet in order to access it from other machines, anyone could use it simply by knowing the URL.
To solve this, we are going to see how to set up a dockerized Ollama server with Caddy basic authentication for a single user.
Deployment ✈ #
Starting the Ollama Server #
To deploy our server, we simply run the following command:

```shell
docker compose -f secure-ollama-server/docker-compose.yaml --env-file secure-ollama-server/.env up -d
```

To deploy with GPU support, we also include the `docker-compose.gpu.yaml` override file:

```shell
docker compose -f secure-ollama-server/docker-compose.yaml -f secure-ollama-server/docker-compose.gpu.yaml --env-file secure-ollama-server/.env up -d
```
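The contents of the GPU override file are not shown in this post. A minimal sketch of what `docker-compose.gpu.yaml` could look like, assuming an NVIDIA GPU and the NVIDIA Container Toolkit installed on the host, is:

```yaml
services:
  ollama:
    deploy:
      resources:
        reservations:
          devices:
            # Assumption: reserve all available NVIDIA GPUs for the container.
            - driver: nvidia
              count: all
              capabilities: [gpu]
```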
Stopping the Ollama Server #
To stop the server after we are finished using it, we have to use the following command:
```shell
docker compose -f secure-ollama-server/docker-compose.yaml --env-file secure-ollama-server/.env down
```
How to use ❔ #
In order to use the server, we have to add the username and the password we specified in the .env file to our request.
An example in curl can be seen below:

```shell
curl -u user:password 127.0.0.1:8200/api/generate -d '{"model": "phi", "prompt": "Why is the sky blue?", "stream": false}'
```
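The same request can be made from Python. The sketch below uses only the standard library and the placeholder credentials from the curl example above; it builds the same `Authorization: Basic` header that curl's `-u` flag produces:

```python
import base64
import json
import urllib.request

# Placeholder credentials from the curl example above; replace with
# the values from your .env file.
username, password = "user", "password"

# curl -u sends an "Authorization: Basic <base64(user:password)>" header.
token = base64.b64encode(f"{username}:{password}".encode()).decode()
headers = {"Authorization": f"Basic {token}"}

payload = json.dumps({
    "model": "phi",
    "prompt": "Why is the sky blue?",
    "stream": False,
}).encode()

request = urllib.request.Request(
    "http://127.0.0.1:8200/api/generate",
    data=payload,
    headers=headers,
)

# Uncomment once the server is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["response"])
```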
Development 🛠 #
The following sections contain technical information about the implementation.
Environmental Variables #
In order for the Caddy authentication to work, we need to set up some environment variables. These must be placed in a .env file, located in the same folder as the docker-compose.yaml file. We can easily create this file by taking a look at the .env-template file. An example of our .env file is the following:

```shell
CADDY_USERNAME=admin
CADDY_PASSWORD=secret-password
```
Docker #
Now, let’s take a look at the docker-compose.yaml file:
```yaml
version: '3.8'
services:
  ollama:
    container_name: ollama-server
    build:
      context: .
      dockerfile: Dockerfile
    pull_policy: always
    tty: true
    restart: always
    ports:
      - 8200:80
    volumes:
      - ./ollama:/root/.ollama
    environment:
      - CADDY_USERNAME=${CADDY_USERNAME}
      - CADDY_PASSWORD=${CADDY_PASSWORD}
```
- The Ollama server is deployed on port 8200.
- A restart: always policy is placed on the container.
- A volume is required for persistence.
- The volume is created in the same folder as the docker-compose.yaml.
- The Caddy environment variables are needed to set up the user and their password.
Caddy Setup #
Caddy is automatically configured for us by the Dockerfile. Let’s take a look to see how this is done.
```dockerfile
FROM ollama/ollama:latest

# Update and install wget to download caddy
RUN apt-get update && apt-get install -y wget

# Download and install caddy
RUN wget --no-check-certificate https://github.com/caddyserver/caddy/releases/download/v2.7.6/caddy_2.7.6_linux_amd64.tar.gz \
    && tar -xvf caddy_2.7.6_linux_amd64.tar.gz \
    && mv caddy /usr/bin/ \
    && chown root:root /usr/bin/caddy \
    && chmod 755 /usr/bin/caddy

# Copy the Caddyfile to the container
COPY Caddyfile /etc/caddy/Caddyfile

# Set the environment variable for the ollama host
ENV OLLAMA_HOST 0.0.0.0

# Expose the port that caddy will listen on
EXPOSE 80

# Copy a script to start both ollama and caddy
COPY start_server.sh /start_server.sh
RUN chmod +x /start_server.sh

# Set the entrypoint to the script
ENTRYPOINT ["/start_server.sh"]
```
- We download the v2.7.6 version of the Caddy server and install it.
- The appropriate permissions and ownerships are applied.
- The start_server.sh script is triggered at the end, as the container's entrypoint.
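The Caddyfile itself is not shown in this post. A minimal sketch of what it could look like is given below; it assumes that the start-up script exports a bcrypt hash of the password as a hypothetical CADDY_PASSWORD_HASH variable (Caddy's basicauth directive expects a hash, not a plaintext password) and that Ollama listens on its default port 11434 inside the container:

```
:80 {
    # Require HTTP Basic auth for every request.
    basicauth {
        {env.CADDY_USERNAME} {env.CADDY_PASSWORD_HASH}
    }
    # Forward authenticated requests to the local Ollama server.
    reverse_proxy 127.0.0.1:11434
}
```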
The start_server.sh script performs the following actions:
- Checks for the presence of the required environment variables.
- Starts the Ollama server.
- Starts the Caddy server.
- Handles shutdown signals, in order for the container to exit gracefully.
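The script itself is not reproduced in this post; a hedged sketch of the steps above could look like the following. The password-hashing step is an assumption, since Caddy's basicauth directive expects a bcrypt hash rather than the plaintext password:

```shell
#!/bin/sh
set -e

# 1. Check for the presence of the required environment variables.
if [ -z "$CADDY_USERNAME" ] || [ -z "$CADDY_PASSWORD" ]; then
    echo "CADDY_USERNAME and CADDY_PASSWORD must be set" >&2
    exit 1
fi

# Assumption: hash the plaintext password for Caddy's basicauth directive.
export CADDY_PASSWORD_HASH="$(caddy hash-password --plaintext "$CADDY_PASSWORD")"

# 2. Start the Ollama server in the background.
ollama serve &
OLLAMA_PID=$!

# 3. Start the Caddy server in the background.
caddy run --config /etc/caddy/Caddyfile &
CADDY_PID=$!

# 4. Forward shutdown signals to both processes so the container
#    exits gracefully, then wait for them to finish.
trap 'kill "$OLLAMA_PID" "$CADDY_PID"' TERM INT
wait
```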