Host AI Locally
Why Run AI Locally?
Before diving into the steps, let’s look at why you might want to host AI locally:
- Data Privacy: Sensitive business data stays within your infrastructure.
- Customizability: Tailor AI models to specific business requirements without external constraints.
- Cost Efficiency: Avoid recurring costs associated with cloud-hosted AI services.
- Latency: Reduce response times for real-time applications.
Install Ollama
With Ollama, all your interactions with large language models happen locally, without sending private data to third-party services. Ollama can also use GPU acceleration when a supported GPU is available. Install it on Linux with the official script:
curl -fsSL https://ollama.com/install.sh | sh
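On Linux the install script also registers Ollama as a systemd service. A quick sanity check (the second line assumes an NVIDIA GPU with drivers installed; skip it otherwise):
systemctl status ollama    # service should be active (running)
nvidia-smi                 # only relevant for NVIDIA GPU acceleration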
Verify Installation
Open your terminal and run:
ollama --version
This should display the version of Ollama you installed.
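You can also confirm the local API server is responding; by default Ollama listens on port 11434:
curl http://localhost:11434    # should print "Ollama is running"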
Pull a Model
Now you can pull a model such as Llama 3.2:
ollama pull llama3.2
More models can be found in the Ollama library.
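As a quick check, you can list what was downloaded and chat with the model directly in the terminal:
ollama list            # models available locally
ollama run llama3.2    # interactive chat; type /bye to exit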
Install Open WebUI
Open WebUI is an extensible, feature-rich, and user-friendly self-hosted AI interface designed to operate entirely offline. It supports various LLM runners, including Ollama and OpenAI-compatible APIs.
- Install Docker:
sudo apt install docker.io
- Run Open WebUI, pointing it at the Ollama server on the host:
sudo docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:main
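You can confirm the container came up with:
sudo docker ps --filter name=open-webui    # STATUS column should read "Up ..."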
Access the interface by opening your browser and navigating to:
http://<your-server-ip>:3000
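If the page does not load from another machine, first check from the server itself (this assumes the 3000:8080 port mapping used above):
curl -I http://localhost:3000    # expect an HTTP 200 response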
Troubleshooting
- Check the logs if the server or UI doesn’t start:
sudo journalctl -u ollama -n 50 --no-pager    # Ollama's systemd service logs
sudo docker logs open-webui
- Ensure no firewall or antivirus software is blocking the local server.
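- If the UI loads but lists no models, the container may be unable to reach Ollama on the host. One possible fix (a sketch assuming the systemd install from above; note it exposes the Ollama API beyond localhost, so firewall it appropriately) is to have Ollama listen on all interfaces:
sudo mkdir -p /etc/systemd/system/ollama.service.d
printf '[Service]\nEnvironment="OLLAMA_HOST=0.0.0.0"\n' | sudo tee /etc/systemd/system/ollama.service.d/override.conf    # assumes the systemd install
sudo systemctl daemon-reload && sudo systemctl restart ollama
sudo docker restart open-webui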
Advanced Configuration
- Use a reverse proxy such as NGINX and set up HTTPS for secure remote access, as sketched below.
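A minimal NGINX sketch (assuming a Debian/Ubuntu-style NGINX install, the 3000:8080 mapping used above, and a hypothetical hostname ai.example.com; adding HTTPS with a tool such as certbot is left as a follow-up):
sudo tee /etc/nginx/sites-available/open-webui >/dev/null <<'EOF'
server {
    listen 80;
    server_name ai.example.com;    # hypothetical hostname, replace with your own
    location / {
        proxy_pass http://127.0.0.1:3000;
        proxy_set_header Host $host;
        # allow the WebSocket upgrades the UI uses
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
    }
}
EOF
sudo ln -s /etc/nginx/sites-available/open-webui /etc/nginx/sites-enabled/
sudo nginx -t && sudo systemctl reload nginx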
This post is licensed under CC BY 4.0 by the author.