About
Ollama is an open source application that can run a wide range of open source LLMs (Large Language Models) and make them available to other software through an easy-to-use HTTP API.
See more at their website here:
Ollama itself is designed for other software to connect to and utilise the LLMs running within it. The software that can make use of these connections varies widely across application types, industries and use cases.
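As a sketch of what such a connection looks like, here is a typical request to Ollama's chat endpoint. This assumes Ollama's default port of 11434, and the model named here is just an example (it's the one we download later in this guide):

```bash
# A typical request client software sends to Ollama's chat endpoint.
# Ollama listens on port 11434 by default; the model must already
# be downloaded (we pull llama3.2:1b later in this guide).
curl http://localhost:11434/api/chat -d '{
  "model": "llama3.2:1b",
  "messages": [{"role": "user", "content": "Hello!"}],
  "stream": false
}'
```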
To simplify the initial setup, SHARON AI offers a post-install application called “Ollama Integrations” that can get you up and running quickly with Ollama. A simple open source chatbot interface called Open-WebUI is included as a way to test that the LLM is working, as well as to download and select new models.
Why host your own LLM?
See our KB article titled “The benefits of private LLMs”:
Installation
Log into your client area and order a new service. Select a GPU service (Ollama and its supported LLMs can run on CPUs, but they are an order of magnitude faster on GPUs). Select a recent Ubuntu LTS distro and the “Ollama Integrations” application:
Configure the rest of the options to suit your needs, including your disk space, SSH public key, etc. Note that some models require considerable disk space, so ensure you allow enough room for both your operating system and your models. Check the Ollama “models” page to see the models on offer and the space they require:
NOTE: The password you set here will be applied to the default `ubuntu` user. We will need this to log in to the tools later.
When happy with your configuration, complete your order process and wait for your virtual machine to start. This can take several minutes while the application deployment downloads and installs the necessary applications and drivers. Output is logged to `/var/log/cloud-init.log` and `/var/log/cloud-init-output.log`, and on newer distributions can be followed via the systemd journal with the command `sudo journalctl -f`.
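For example, assuming the standard cloud-init tooling that ships with Ubuntu, you can SSH in as the `ubuntu` user and confirm the deployment has finished:

```bash
# Block until cloud-init finishes, then report the result;
# "status: done" means the deployment completed successfully
cloud-init status --wait

# Follow the deployment output live (newer distributions)
sudo journalctl -f

# Or review the full deployment log after the fact
less /var/log/cloud-init-output.log
```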
First boot: Open-WebUI
On first boot, once you’ve verified the applications have installed and are available (see the installation notes above on how to verify that), you can log into Open-WebUI to test Ollama and download your first model.
Find your system’s IP address in your SHARON AI billing dashboard, and browse to it on port 3000. So if your IP is 123.456.789.123, your URL would be http://123.456.789.123:3000.
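If the page doesn’t load, it can help to check that Ollama itself is answering. Assuming the default configuration, Ollama listens on port 11434 on the VM itself, so over SSH:

```bash
# Check the Ollama service is up and responding
curl http://localhost:11434/api/version

# List installed models (empty until you pull your first one)
curl http://localhost:11434/api/tags
```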
You’ll be presented with the Open-WebUI new user screen. Here you can create a completely private and offline user account:
Once created, you’ll be logged in and presented with some information about recent Open-WebUI updates.
To download an LLM, head to the Ollama model search page:
And search for a model that you wish to use. For this example, I’ll choose Meta’s Llama 3.2 model, and specify the smallest one at 1 billion parameters:
It tells me that I would run it via the command line with the command `ollama run llama3.2:1b`. However, I’ll download it via the Open-WebUI interface instead, simply by copying the `llama3.2:1b` string and, in the top-left corner of Open-WebUI, choosing “select a model” and pasting it in there.
I’ll be given the option to Pull the model from ollama.com, and on doing so Open-WebUI will download and install that model for me. This particular model is tiny at only 1.3GB (some models can exceed 400GB), so it will download and install quickly.
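If you prefer the command line, the same model can be pulled over SSH with Ollama’s own CLI:

```bash
# Download the model without starting a chat session
ollama pull llama3.2:1b

# Confirm it is installed and check its size on disk
ollama list
```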
Once installed, if I again click “select a model”, this time the installed model will show up:
I can select it, and ask the model to identify itself to prove it’s working. Hovering over the “i” information icon below the response will also give you some interesting statistics about the processing speed and number of tokens that the request and answer took, which is a great way to compare CPUs and GPUs:
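The same statistics are available outside the UI. As a sketch, `ollama run` accepts a `--verbose` flag that prints token counts and evaluation rates after each response, and the raw API returns them as JSON fields:

```bash
# Run a one-off prompt and print timing statistics afterwards
# (load time, prompt/response token counts, tokens per second)
ollama run llama3.2:1b "Identify yourself in one sentence." --verbose

# The HTTP API reports the same numbers, e.g. eval_count
# (response tokens) and eval_duration (nanoseconds);
# tokens per second is eval_count / eval_duration * 1e9
curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2:1b",
  "prompt": "Identify yourself in one sentence.",
  "stream": false
}'
```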
Next steps
From here, you can start integrating other tools into Ollama. See our guides for the following examples:
- Integrate Ollama with Microsoft VSCode as a private alternative to Copilot:
- URL
- Integrate Ollama with an in-browser assistant as a private website summary tool:
- URL
- Integrate Ollama with n8n, the enterprise open source process and workflow automation tool:
- URL