Host AI Locally

Why Run AI Locally?

Before diving into the steps, let’s explore why you might want to host AI locally:

  • Data Privacy: Sensitive business data stays within your infrastructure.
  • Customizability: Tailor AI models to specific business requirements without external constraints.
  • Cost Efficiency: Avoid recurring costs associated with cloud-hosted AI services.
  • Latency: Reduce response times for real-time applications.

Install Ollama

With Ollama, all your interactions with large language models happen locally, without sending private data to third-party services.

Ollama can run with GPU acceleration. Install it with:

curl -fsSL https://ollama.com/install.sh | sh
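Ollama uses the GPU automatically when supported drivers are present. Assuming an NVIDIA card, you can check that the driver is visible before involving Ollama at all:

nvidia-smi

If the GPU and driver version are listed here, Ollama should be able to pick it up; otherwise it falls back to the CPU.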

Verify Installation

Open your terminal and run:

ollama --version

This should display the version of Ollama you installed.
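On Linux, the install script also registers Ollama as a systemd service that starts in the background. Assuming a systemd-based distribution, you can confirm the server is running with:

systemctl status ollama

If the service is not active, running ollama serve starts the server manually in the foreground.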

Pull a Model

Now you can pull a model, like Llama 3.2:

ollama pull llama3.2

More models can be found in the Ollama library.
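Once a model is downloaded, you can chat with it straight from the terminal and check what is installed and where it is running:

ollama run llama3.2
ollama list
ollama ps

ollama list shows every model you have pulled; ollama ps shows the models currently loaded in memory, including whether they are running on the GPU or the CPU.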

Install Open WebUI

Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI interface designed to operate entirely offline. It supports various LLM runners, including Ollama and OpenAI-compatible APIs.

  • Install Docker:

    sudo apt install docker.io

  • Run Open WebUI:

    sudo docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main

    The --add-host flag maps host.docker.internal to the Docker host, so the container can reach the Ollama server listening on port 11434 outside the container.

Access the interface by opening your browser and navigating to:

http://<your-server-ip>:3000
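Open WebUI talks to Ollama over its local HTTP API on port 11434. You can exercise that API directly with curl to confirm Ollama is answering requests (this assumes the llama3.2 model pulled earlier):

curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt": "Why is the sky blue?"}'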

Troubleshooting

  • Check the logs if the server or UI doesn’t start:

    journalctl -u ollama
    sudo docker logs open-webui
  • Ensure no firewall or antivirus software is blocking the local server.
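  • Confirm the Open WebUI container is up and port 3000 is mapped:

    sudo docker ps --filter name=open-webui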

Advanced Configuration

  • Use reverse proxies like NGINX or set up HTTPS for secure access; a minimal example follows below.
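As a sketch of the reverse-proxy option, here is a minimal NGINX server block. It assumes Open WebUI is listening on port 3000 and uses the hypothetical domain ai.example.com; for HTTPS you would additionally add certificates (for example via certbot). Open WebUI streams responses over WebSockets, so the Upgrade headers are included.

server {
    listen 80;
    server_name ai.example.com;  # hypothetical domain

    location / {
        proxy_pass http://127.0.0.1:3000;

        # Open WebUI uses WebSockets for streaming responses
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";

        # forward client information to the app
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

With this in place, you browse to the domain instead of the raw port.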
This post is licensed under CC BY 4.0 by the author.
