Build an AI server in your myDRE Workspace and run AI models locally with a beautiful chat interface!
It's customizable and super fast, much faster than anything else I've used. And again, it's local, it's private. You control it!
Isn't that amazing?
Let's demo the entire setup on this Linux Virtual Machine: a Standard_NC4as_T4_v3 (4 cores, 28 GB RAM, NVIDIA T4 GPU).
It's not limited to GPU machines!
When using AI, LLMs, and GPTs, especially when using external services, consult your CISO, DPO, and Legal. At the very least, go over all the fine print!
Domains to be allowlisted:
microsoft.com
developer.download.nvidia.com
archive.ubuntu.com
docker.io
docker.com
ollama.com
ghcr.io
After installing, it's best to remove any domains you don't need!
Install CUDA and GPU drivers for Ubuntu 20.04
Run the following lines before installing anything else:
- sudo apt update
- sudo apt upgrade -y
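The exact driver commands for this step aren't included in the text above. As a sketch, one common route on Ubuntu 20.04 is the `ubuntu-drivers` tool plus the CUDA toolkit from the Ubuntu repositories (NVIDIA's own repository at developer.download.nvidia.com, already in the allowlist, is an alternative):

```shell
# Install the recommended NVIDIA driver for the detected GPU
sudo apt install -y ubuntu-drivers-common
sudo ubuntu-drivers autoinstall

# Install the CUDA toolkit from the Ubuntu repositories
sudo apt install -y nvidia-cuda-toolkit

# Reboot so the new driver is loaded
sudo reboot
```

After the reboot, running `nvidia-smi` should list the T4 GPU.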
Install Ollama
Now that you have a Linux VM up and running with a GPU and the CUDA drivers installed, go to https://ollama.com to download Ollama, the tool we'll use to run AI models. This is the foundation for the entire setup.
Install with one command:
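Ollama's documented one-line installer for Linux:

```shell
# Download and run the official Ollama install script
curl -fsSL https://ollama.com/install.sh | sh
```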
If the installer does not detect your GPU, you may have to install the NVIDIA CUDA drivers first!
Ollama can also be set up and used on a machine without a GPU (it will run on the CPU).
Test if Ollama is working:
Open the browser and go to localhost:11434. If you see "Ollama is running", you are good to go. Port 11434 is where Ollama's API service is running.
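You can also check from the terminal; a minimal check, assuming Ollama is on its default port:

```shell
# Query the local Ollama API (default port 11434); prints the service
# greeting if it is up, or a notice if it is not reachable
curl -s http://localhost:11434 || echo "Ollama is not reachable on port 11434"
```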
Add an AI model to Ollama:
Let's test it out: open a new command window and run a GPU monitor. This will "watch" the performance of the GPU right here in the terminal, refreshing every 5 seconds.
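The monitoring command isn't shown in the text; the standard tool for this (assuming the NVIDIA driver is installed) is `watch` wrapped around `nvidia-smi`:

```shell
# Re-run nvidia-smi every 5 seconds to monitor GPU load and memory
watch -n 5 nvidia-smi
```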
Run llama2 and ask a question
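Starting a chat is a single command; on first run, Ollama downloads the model automatically:

```shell
# Start an interactive chat with the llama2 model
ollama run llama2
```

Type your question at the prompt; /bye exits the session.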
Open WebUI
This will be running inside a Docker container, so you will need Docker installed.
Run the commands below:
- # Add Docker's official GPG key:
- sudo apt-get update
- sudo apt-get install ca-certificates curl
- sudo install -m 0755 -d /etc/apt/keyrings
- sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
- sudo chmod a+r /etc/apt/keyrings/docker.asc
Execute the following commands:
- # Add the repository to Apt sources:
- echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
- sudo apt-get update
Install Docker
- # Install Docker
- sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Run Open WebUI Docker Container
Now, with Docker installed, we'll use it to deploy our Open WebUI container.
This Docker command pulls the Open WebUI image and runs the container. It points the Ollama base URL at your local machine, because it is going to integrate with and use Ollama, and it uses the host network adapter to keep things nice and easy. Keep in mind this will use port 8080 on whatever system you are using.
- sudo docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Run the following to verify that Docker is installed:
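The verification command itself isn't shown above; `docker ps` lists running containers, so after the previous step the open-webui container should appear in its output:

```shell
# Show the Docker version, then list the running containers
sudo docker --version
sudo docker ps
```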
Start Using Open WebUI on the Virtual Machine
Open WebUI in the browser by going to localhost:8080. The first time you run it, click Sign up at the bottom and add your details. This login info only applies to this local instance.
The first account you sign up with automatically becomes the admin account; as the first user to log in, you get full control.
Select the model(s)
By default, llama2 should be available; that also confirms our connection to Ollama is working.
We can download more models by going to https://ollama.com/library and browsing what's available. codegemma is a big one; let's try that.
Go back to the command line and pull the model.
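The pull command isn't shown in the text; models from the library are downloaded with `ollama pull`, using the name exactly as listed on the library page:

```shell
# Download the codegemma model from the Ollama library
ollama pull codegemma
```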
Let's test it out and see the GPU being used
And just like that, we have our own ChatGPT-like assistant that is completely local.
Let's bring this to another level!
Within the same Workspace, one virtual machine (VM), whether Windows or Linux, can use a large language model served by another VM, also Windows or Linux. This lets you share a single language model across multiple virtual environments.
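A minimal sketch of such a setup, assuming the serving VM's private IP is 10.0.0.5 (a placeholder; use your VM's actual address). `OLLAMA_HOST` makes Ollama listen on all interfaces instead of only localhost:

```shell
# On the VM serving the model: listen on all interfaces, not just 127.0.0.1
OLLAMA_HOST=0.0.0.0 ollama serve

# On the client VM: point Open WebUI at the serving VM instead of localhost
sudo docker run -d --network=host -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://10.0.0.5:11434 \
  --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```

Depending on how Ollama was installed, it may run as a systemd service, in which case `OLLAMA_HOST` needs to be set in the service's environment rather than on the command line.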
Exploring Additional Possibilities
- Upload Your Own Files and Pictures
- Since it's running on your own system, you can easily upload any files or pictures you want without sharing your data.
- Control Who Can Use It
- As the admin, you get to decide who can sign up and use your local AI server. You can turn signups on or off, and even make it so new users have to be approved by you before they can access it. This is great for keeping things private or monitoring what your kids are doing.
- Choose Which AI Models Are Allowed
- You can pick which specific AI models are allowed to be used. For example, if you want to limit what kinds of responses the invited users see, you can only allow certain "user-friendly" models.
- Make Your Own Custom AI Models
- The really cool part is that you can actually create your own custom AI models from scratch. You decide what knowledge to put into them and what rules they follow. This lets you build AI assistants that are perfect for your specific needs.
Overall, having a local GPT gives you tons of control and flexibility. You can make it as open or as locked down as you want, and even build entirely new AI assistants tailored just for you.
Related Articles
- AI, LLMs, GPTs and myDRE
- Julius AI
- Perplexity.ai
- GPT4All
- Federated Learning on myDRE with VANTAGE6