If you like Ollama and llama.cpp, you will love LLMs Server: run decentralized (local) LLMs on your computer privately and host ALL your AI locally.
-
first download Ollama on your computer:
-
choose the version for your OS:
For Windows
just click on download (the preview).
For Linux
run this command in your terminal (in my case I am using WSL on Windows, so I can use the Linux command in the Windows terminal):
curl -fsSL https://ollama.com/install.sh | sh
For macOS
just click on download.
-
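once the installer finishes, you can check that the ollama CLI is available from your terminal (it should print the installed version):
ollama --version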
then pull the LLM you want by running one of the following commands in your terminal:
ollama pull llama3
ollama pull mistral
then run the model to check it out:
ollama run llama3
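you can also list the models you have already downloaded:
ollama list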
to check that Ollama is working on your computer, open your Chrome browser and type this in the address bar:
localhost:11434
it should be shown like this:
Ollama is running
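you can also test the Ollama REST API directly from the terminal; this is a minimal check assuming the default port 11434 and that you already pulled llama3:
curl http://localhost:11434/api/generate -d '{"model": "llama3", "prompt": "Why is the sky blue?", "stream": false}'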
-
-
Second, connect llama3 (Llama 3 8B) to a web GUI:
-
first, install Docker by typing the following commands:
INSTALL DOCKER
# Add Docker's official GPG key:
sudo apt-get update
sudo apt-get install ca-certificates curl
sudo install -m 0755 -d /etc/apt/keyrings
sudo curl -fsSL https://download.docker.com/linux/ubuntu/gpg -o /etc/apt/keyrings/docker.asc
sudo chmod a+r /etc/apt/keyrings/docker.asc

# Add the repository to Apt sources:
echo \
  "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/ubuntu \
  $(. /etc/os-release && echo "$VERSION_CODENAME") stable" | \
  sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt-get update
sudo apt install docker.io
or
# Install Docker
sudo apt-get install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
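to confirm Docker is installed and the daemon is running before moving on, you can check the version and run the hello-world test image:
sudo docker --version
sudo docker run hello-world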
RUN OPEN WEBUI DOCKER CONTAINER
sudo docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
RUN THE WEBUI WITH LLAMA BY OPENING localhost:8080 IN YOUR BROWSER'S ADDRESS BAR
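if the page does not load, check that the Open WebUI container is actually running and inspect its logs (the container name comes from the --name flag used above):
sudo docker ps
sudo docker logs -f open-webui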
This installation method uses a single container image that bundles Open WebUI with Ollama, allowing for a streamlined setup via a single command. Choose the appropriate command based on your hardware setup:
With GPU Support:
Utilize GPU resources by running the following command:
docker run -d -p 3000:8080 --gpus=all -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
For CPU Only:
If you're not using a GPU, use this command instead:
docker run -d -p 3000:8080 -v ollama:/root/.ollama -v open-webui:/app/backend/data --name open-webui --restart always ghcr.io/open-webui/open-webui:ollama
-
second, open
localhost:8080
(or localhost:3000, if you used the bundled image from the previous step), then choose the model you want to chat with.
now you should be able to chat with, for example, llama3 through a GUI locally on your computer.
Note: if Docker does not start on your computer after the installation, you might need to start it manually from the terminal:
sudo dockerd
the same goes for Ollama; it may show an error like this one:
Error: could not connect to Ollama app, is it running?
to solve this, run this command:
ollama serve
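on WSL setups without systemd, a rough way to start both services in the background from one terminal is the following sketch (adjust it to your setup; output is discarded here):
# start the Docker daemon in the background
sudo dockerd > /dev/null 2>&1 &
# start the Ollama server in the background
ollama serve > /dev/null 2>&1 &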
-
Example of Docker Configuration for Open WebUI and Ollama
version: '3.8'
services:
  ollama:
    image: ollama/ollama:latest
    container_name: ollama
    volumes:
      - ollama-data:/root/.ollama
    networks:
      - webui-network
    ports:
      - "11434:11434"
  webui:
    image: ghcr.io/open-webui/open-webui:latest
    container_name: open-webui
    volumes:
      - webui-data:/app/backend/data
    environment:
      - OLLAMA_BASE_URL=http://ollama:11434
    networks:
      - webui-network
    ports:
      - "8080:8080"
volumes:
  ollama-data:
  webui-data:
networks:
  webui-network:
In this configuration:
Ollama is set up to expose port 11434.
Open WebUI is configured to interact with Ollama using its internal Docker network address (http://ollama:11434).
Both services share the webui-network, allowing them to communicate securely.
Port 8080 on the host is mapped to port 8080 in the open-webui container for web access.
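assuming you saved the configuration above as docker-compose.yml, you can bring up both containers with a single command (older Docker installs use docker-compose instead of docker compose):
docker compose up -d
# pull a model inside the running ollama container (if you have not already)
docker exec -it ollama ollama pull llama3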
to check whether Ollama is using your GPU, type this in your terminal: nvidia-smi
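if Ollama runs inside Docker with GPU access (--gpus=all), you can also check GPU visibility from inside the container; this assumes the NVIDIA Container Toolkit is installed and the container is named ollama as in the compose file above:
sudo docker exec -it ollama nvidia-smi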
-
Xllama: Xllama is an advanced language model framework, inspired by the original Llama model but enhanced with additional features such as Grouped Query Attention (GQA), Multi-Head Attention (MHA), and more. This project aims to provide a flexible and extensible platform for experimenting with various attention mechanisms and building state-of-the-art natural language processing models.
install the required libraries:
pip install -r requirements.txt
or
pip install torch transformers
clone the repo:
git clone https://github.com/Esmail-ibraheem/X-Llama.git
run the download shell script to download the Llama 2 weights:
./download.sh
after downloading the weights, run the inference code:
python inference.py
@misc{Gumaan2024-llms-server,
title = "llms-server",
author = "Gumaan, Esmail",
howpublished = {\url{https://github.com/Esmail-ibraheem/llms-server}},
year = "2024",
month = "July",
note = "[Online; accessed 2024-06-23]",
}
in this repo, I tried to simplify how to run Ollama locally on your computer, and how to use Open WebUI with Docker so you can interact with llama not just from the terminal but also through a web UI.
just follow the instructions I provided, and you should be able to use an LLM on your computer locally without an internet connection. In the future, I will build my own models and upload them here so that you can use them just like the Ollama ones.
for now, you can access the pre-built models from Ollama, depending on which model you download; in the future, I will add my own models to the WebUI, and you will also be able to chat with them from the terminal (I am currently working on it).
to read more about the WebUI, Docker, and Ollama, see my article on Medium: Build your own llms server.