Host AI Locally
Why Run AI Locally?
Before diving into the steps, let’s look at why you might want to host AI locally:
- Data Privacy: Sensitive business data stays within your infrastructure.
- Customizability: Tailor AI models to specific business requirements without external constraints.
- Cost Efficiency: Avoid recurring costs associated with cloud-hosted AI services.
- Latency: Reduce response times for real-time applications.
Install Ollama
With Ollama, all your interactions with large language models happen locally, without sending private data to third-party services. Ollama can also use GPU acceleration when a supported GPU is available. Install it on Linux with the official script:
curl -fsSL https://ollama.com/install.sh | sh
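On Linux the install script also registers Ollama as a systemd service. A quick sanity check (the second line assumes an NVIDIA GPU with drivers installed; skip it otherwise):
systemctl status ollama    # service should be active (running)
nvidia-smi                 # only relevant for NVIDIA GPU acceleration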
Verify Installation
Open your terminal and run:
ollama --version
This should display the version of Ollama you installed.
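You can also confirm the local API server is responding; by default Ollama listens on port 11434:
curl http://localhost:11434    # should print "Ollama is running"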
Pull a Model
Now you can pull a model such as Llama 3.2:
ollama pull llama3.2
More models can be found in the Ollama library.
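As a quick check, you can list what was downloaded and chat with the model directly in the terminal:
ollama list            # models available locally
ollama run llama3.2    # interactive chat; type /bye to exit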
Install Open WebUI
Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI interface designed to operate entirely offline. It supports various LLM runners, including Ollama and OpenAI-compatible APIs.
- Install Docker:
sudo apt install docker.io
- Run Open WebUI, pointing it at the Ollama server on the host:
sudo docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
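You can confirm the container came up with:
sudo docker ps --filter name=open-webui    # STATUS column should read "Up ..."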
Access the interface by opening your browser and navigating to:
http://<your-server-ip>:3000
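If the page does not load from another machine, first check from the server itself (this assumes the 3000:8080 port mapping used above):
curl -I http://localhost:3000    # expect an HTTP 200 response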
Troubleshooting
- Check the logs if the server or UI doesn’t start:
sudo journalctl -u ollama -n 50 --no-pager    # Ollama's systemd service logs
sudo docker logs open-webui
- Ensure no firewall or antivirus software is blocking the local server.
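- If the UI loads but lists no models, the container may be unable to reach Ollama on the host. One possible fix (a sketch assuming the systemd install from above; note it exposes the Ollama API beyond localhost, so firewall it appropriately) is to have Ollama listen on all interfaces:
sudo mkdir -p /etc/systemd/system/ollama.service.d
printf '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0"\n' | sudo tee /etc/systemd/system/ollama.service.d/override.conf    # assumes the systemd install
sudo systemctl daemon-reload && sudo systemctl restart ollama
sudo docker restart open-webui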
Advanced Configuration
- Use a reverse proxy such as NGINX and set up HTTPS for secure remote access, as sketched below.
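A minimal NGINX sketch (assuming a Debian/Ubuntu-style NGINX install, the 3000:8080 mapping used above, and a hypothetical hostname ai.example.com; adding HTTPS with a tool such as certbot is left as a follow-up):
sudo tee /etc/nginx/sites-available/open-webui >/dev/null <<'EOF'
server {
    listen 80;
    server_name ai.example.com;    # hypothetical hostname, replace with your own
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        # allow the WebSocket upgrades the UI uses
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
EOF
sudo ln -s /etc/nginx/sites-available/open-webui /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx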
This post is licensed under CC BY 4.0 by the author.