Build an AI server in your myDRE Workspace and run AI models locally with a beautiful chat interface!
It's customizable and super fast, much faster than anything else I've used. And again, it's local, it's private. You control it!
Isn't that amazing?
Let's demo the entire setup on this Linux Virtual Machine: a Standard_NC4as_T4_v3 (4 cores, 28 GB RAM, NVIDIA T4 GPU).
It's not limited to GPU machines!
When using AI, LLMs, and GPTs, especially when using external services, consult your CISO, DPO, and Legal. At the very least, go over all the fine print!
Domains to be allowlisted:
microsoft.com
developer.download.nvidia.com
archive.ubuntu.com
docker.io
docker.com
ollama.com
ghcr.io
After installing, it's best to remove any domains you don't need!
Install CUDA and GPU drivers for Ubuntu 20.04
Run the following lines before installing anything else:
- sudo apt update
- sudo apt upgrade -y
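The exact driver commands for this step aren't included in the text above. As a sketch, one common route on Ubuntu 20.04 is the `ubuntu-drivers` tool plus the CUDA toolkit from the Ubuntu repositories (NVIDIA's own repository at developer.download.nvidia.com, already in the allowlist, is an alternative):

```shell
# Install the recommended NVIDIA driver for the detected GPU
sudo apt install -y ubuntu-drivers-common
sudo ubuntu-drivers autoinstall

# Install the CUDA toolkit from the Ubuntu repositories
sudo apt install -y nvidia-cuda-toolkit

# Reboot so the new driver is loaded
sudo reboot
```

After the reboot, running `nvidia-smi` should list the T4 GPU.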
Install Ollama
Now that you have a Linux VM up and running with a GPU and the CUDA drivers installed, go to https://ollama.com to download Ollama, the tool we'll use to run AI models. This is the foundation for the entire setup.
Install with one command:
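Ollama's documented one-line installer for Linux:

```shell
# Download and run the official Ollama install script
curl -fsSL https://ollama.com/install.sh | sh
```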
If the installer does not detect your GPU, you may have to install the NVIDIA CUDA drivers first!
Ollama can also be set up and used on a machine without a GPU (it will run on the CPU).
Test if Ollama is working:
Open the browser and go to localhost:11434. If you see "Ollama is running", you are good to go. Port 11434 is where Ollama's API service is running.
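You can also check from the terminal; a minimal check, assuming Ollama is on its default port:

```shell
# Query the local Ollama API (default port 11434); prints the service
# greeting if it is up, or a notice if it is not reachable
curl -s http://localhost:11434 || echo "Ollama is not reachable on port 11434"
```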
Add an AI model to Ollama:
Let's test it out: open a new command window and run a GPU monitor. This will "watch" the performance of the GPU right here in the terminal, refreshing every 5 seconds.
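The monitoring command isn't shown in the text; the standard tool for this (assuming the NVIDIA driver is installed) is `watch` wrapped around `nvidia-smi`:

```shell
# Re-run nvidia-smi every 5 seconds to monitor GPU load and memory
watch -n 5 nvidia-smi
```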
Run llama2 and ask a question
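Starting a chat is a single command; on first run, Ollama downloads the model automatically:

```shell
# Start an interactive chat with the llama2 model
ollama run llama2
```

Type your question at the prompt; /bye exits the session.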
Open WebUI
This will be running inside a Docker container, so you will need Docker installed.
Run the commands below:
- # Add Docker's official GPG key:
- sudo apt-get update
- sudo apt-get install ca-certificates curl
- sudo install -m 0755 -d /etc/apt/keyrings
- sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
- sudo chmod a+r /etc/apt/keyrings/docker.asc
Execute the following commands:
- # Add the repository to Apt sources:
- echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
- sudo apt-get update
Install Docker
- # Install Docker
- sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
Run Open WebUI Docker Container
Now, with Docker installed, we'll use it to deploy our Open WebUI container.
This Docker command pulls the Open WebUI image and runs the container. It points the Ollama base URL at your local machine, because it is going to integrate with and use Ollama, and it uses the host network adapter to keep things nice and easy. Keep in mind this will use port 8080 on whatever system you are using.
- sudo docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
Run the following to verify that Docker is installed:
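The verification command itself isn't shown above; `docker ps` lists running containers, so after the previous step the open-webui container should appear in its output:

```shell
# Show the Docker version, then list the running containers
sudo docker --version
sudo docker ps
```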
Start Using Open WebUI on the Virtual Machine
Open WebUI in the browser by going to localhost:8080. The first time you run it, click Sign up at the bottom and add your details. This login info only applies to this local instance.
The first account you sign up with automatically becomes the admin account; as the first user to log in, you get full control.
Select the model(s)
By default, llama2 should be available; that also confirms our connection to Ollama is working.
We can download more models by going to https://ollama.com/library and browsing what's available. codegemma is a big one; let's try that.
Go back to the command line and pull the model.
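The pull command isn't shown in the text; models from the library are downloaded with `ollama pull`, using the name exactly as listed on the library page:

```shell
# Download the codegemma model from the Ollama library
ollama pull codegemma
```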
Let's test it out and see the GPU being used
And just like that, we have our own ChatGPT-like assistant that is completely local.
Let's bring this to another level!
Within the same Workspace, one virtual machine (VM), whether Windows or Linux, can use a large language model served by another VM, also Windows or Linux. This lets you share a single language model across multiple virtual environments.
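A minimal sketch of such a setup, assuming the serving VM's private IP is 10.0.0.5 (a placeholder; use your VM's actual address). `OLLAMA_HOST` makes Ollama listen on all interfaces instead of only localhost:

```shell
# On the VM serving the model: listen on all interfaces, not just 127.0.0.1
OLLAMA_HOST=0.0.0.0 ollama serve

# On the client VM: point Open WebUI at the serving VM instead of localhost
sudo docker run -d --network=host -v open-webui:/app/backend/data \
  -e OLLAMA_BASE_URL=http://10.0.0.5:11434 \
  --name open-webui --restart always ghcr.io/open-webui/open-webui:main
```

Depending on how Ollama was installed, it may run as a systemd service, in which case `OLLAMA_HOST` needs to be set in the service's environment rather than on the command line.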
Exploring Additional Possibilities
- Upload Your Own Files and Pictures
- Since it's running on your own system, you can easily upload any files or pictures you want without sharing your data.
- Control Who Can Use It
- As the admin, you get to decide who can sign up and use your local AI server. You can turn signups on or off, and even make it so new users have to be approved by you before they can access it. This is great for keeping things private or monitoring what your kids are doing.
- Choose Which AI Models Are Allowed
- You can pick which specific AI models are allowed to be used. For example, if you want to limit what kinds of responses the invited users see, you can only allow certain "user-friendly" models.
- Make Your Own Custom AI Models
- The really cool part is that you can actually create your own custom AI models from scratch. You decide what knowledge to put into them and what rules they follow. This lets you build AI assistants that are perfect for your specific needs.
Overall, having a local GPT gives you tons of control and flexibility. You can make it as open or as locked down as you want, and even build entirely new AI assistants tailored just for you.
Related Articles
- AI, LLMs, GPTs and myDRE
- Julius AI
- Perplexity.ai
- GPT4All
- Federated Learning on myDRE with VANTAGE6