Merge pull request #22 from bmd1905/bmd1905/integrate-openwebui
bmd1905 authored Aug 26, 2024
2 parents 5700eb6 + 1077bca commit b4bfbeb
Showing 64 changed files with 11,561 additions and 68 deletions.
3 changes: 3 additions & 0 deletions .gitmodules
@@ -0,0 +1,3 @@
[submodule "open-webui"]
path = open-webui
url = https://github.com/bmd1905/open-webui
268 changes: 200 additions & 68 deletions README.md
@@ -13,163 +13,295 @@

## Target Audience: Developers

This project integrates [Open WebUI](https://github.com/open-webui/open-webui) as the backend and frontend for a hands-on MLOps environment, with custom-built infrastructure including Jenkins CI/CD pipelines, Kubernetes for orchestration, and deployments on Google Kubernetes Engine (GKE). The project aims to provide practical MLOps experience, leveraging Open WebUI's capabilities to manage and deploy large language models (LLMs) in a scalable, cloud-native environment.


## Key Features of Open WebUI ⭐ (from [@open-webui/open-webui](https://github.com/open-webui/open-webui))

- 🚀 **Effortless Setup**: Install seamlessly using Docker or Kubernetes (kubectl, kustomize or helm) for a hassle-free experience with support for both `:ollama` and `:cuda` tagged images.

- 🤝 **Ollama/OpenAI API Integration**: Effortlessly integrate OpenAI-compatible APIs for versatile conversations alongside Ollama models. Customize the OpenAI API URL to link with **LMStudio, GroqCloud, Mistral, OpenRouter, and more**.

- 🧩 **Pipelines, Open WebUI Plugin Support**: Seamlessly integrate custom logic and Python libraries into Open WebUI using [Pipelines Plugin Framework](https://github.com/open-webui/pipelines). Launch your Pipelines instance, set the OpenAI URL to the Pipelines URL, and explore endless possibilities. [Examples](https://github.com/open-webui/pipelines/tree/main/examples) include **Function Calling**, User **Rate Limiting** to control access, **Usage Monitoring** with tools like Langfuse, **Live Translation with LibreTranslate** for multilingual support, **Toxic Message Filtering** and much more.

- 📱 **Responsive Design**: Enjoy a seamless experience across Desktop PC, Laptop, and Mobile devices.

- 📱 **Progressive Web App (PWA) for Mobile**: Enjoy a native app-like experience on your mobile device with our PWA, providing offline access on localhost and a seamless user interface.

- ✒️🔢 **Full Markdown and LaTeX Support**: Elevate your LLM experience with comprehensive Markdown and LaTeX capabilities for enriched interaction.

- 🎤📹 **Hands-Free Voice/Video Call**: Experience seamless communication with integrated hands-free voice and video call features, allowing for a more dynamic and interactive chat environment.

- 🛠️ **Model Builder**: Easily create Ollama models via the Web UI. Create and add custom characters/agents, customize chat elements, and import models effortlessly through [Open WebUI Community](https://openwebui.com/) integration.

- 🐍 **Native Python Function Calling Tool**: Enhance your LLMs with built-in code editor support in the tools workspace. Bring Your Own Function (BYOF) by simply adding your pure Python functions, enabling seamless integration with LLMs.

- 📚 **Local RAG Integration**: Dive into the future of chat interactions with groundbreaking Retrieval Augmented Generation (RAG) support. This feature seamlessly integrates document interactions into your chat experience. You can load documents directly into the chat or add files to your document library, effortlessly accessing them using the `#` command before a query.

- 🔍 **Web Search for RAG**: Perform web searches using providers like `SearXNG`, `Google PSE`, `Brave Search`, `serpstack`, `serper`, `Serply`, `DuckDuckGo` and `TavilySearch` and inject the results directly into your chat experience.

- 🌐 **Web Browsing Capability**: Seamlessly integrate websites into your chat experience using the `#` command followed by a URL. This feature allows you to incorporate web content directly into your conversations, enhancing the richness and depth of your interactions.

- 🎨 **Image Generation Integration**: Seamlessly incorporate image generation capabilities using options such as AUTOMATIC1111 API or ComfyUI (local), and OpenAI's DALL-E (external), enriching your chat experience with dynamic visual content.

- ⚙️ **Many Models Conversations**: Effortlessly engage with various models simultaneously, harnessing their unique strengths for optimal responses. Enhance your experience by leveraging a diverse set of models in parallel.

- 🔐 **Role-Based Access Control (RBAC)**: Ensure secure access with restricted permissions; only authorized individuals can access your Ollama, and exclusive model creation/pulling rights are reserved for administrators.

- 🌐🌍 **Multilingual Support**: Experience Open WebUI in your preferred language with our internationalization (i18n) support. Join us in expanding our supported languages! We're actively seeking contributors!

- 🌟 **Continuous Updates**: We are committed to improving Open WebUI with regular updates, fixes, and new features.

## Getting Started

### Local Development

**1. Clone the Repository:**

First, you'll need to clone the project's repository from GitHub to your local machine. This will create a copy of the codebase in a directory named `PromptAlchemy`.

```bash
git clone https://github.com/bmd1905/PromptAlchemy.git
cd PromptAlchemy
```
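
This commit also adds Open WebUI as a git submodule (see the `.gitmodules` change above), so initialize submodules after cloning:

```bash
# Pull in the open-webui submodule referenced in .gitmodules
git submodule update --init --recursive
```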

**2. Backend Setup:**

To set up the backend, follow these steps:

- **(Optional) Conda Environment:**

It's recommended to use a Conda environment to manage dependencies and avoid conflicts with other Python projects. If you don't have Conda installed, you can install it by following the instructions on the [Anaconda website](https://docs.anaconda.com/anaconda/install/).

```bash
conda create --name open-webui-env python=3.11
conda activate open-webui-env
```

- **Install Dependencies:**

Install the required Python packages using `pip`. The `-r requirements.txt` option installs every dependency listed in `requirements.txt`, and the `-U` flag upgrades already-installed packages to the latest available versions.

```bash
pip install -r requirements.txt -U
```

- **Start the Backend Server:**

After installing the dependencies, you can start the backend server using the provided script. This script will launch the server, making it ready to handle API requests.

```bash
bash start.sh
```
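
To sanity-check that the server came up, you can probe it with `curl`. This is a sketch that assumes the backend listens on port 8080 (Open WebUI's default) and exposes a `/health` endpoint; adjust if `start.sh` is configured differently:

```bash
# Assumed default port 8080 -- change if start.sh overrides it
curl http://localhost:8080/health
```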

**3. Frontend Setup:**

The frontend of the application is located in the `open-webui` directory. To set it up, navigate to the directory and install the necessary dependencies:

```bash
cd open-webui
npm install
```

- **Build and Run the Frontend:**

Once the dependencies are installed, build the frontend assets and start the development server:

```bash
npm run build
npm run dev
```

The development server will host the frontend, allowing you to interact with the application via a web browser.

**4. Configuration:**

To configure the application, you'll need to set up environment variables. The `.env.example` file contains example configurations. Copy this file to `.env` and fill in the required variables, such as API keys for language models.

```bash
cp -RPp .env.example .env
```

Edit the `.env` file with your specific configuration details, ensuring that all required environment variables are set.
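
As a rough sketch, the resulting `.env` might contain entries like the ones below. The variable names here are illustrative assumptions; defer to `.env.example` for the exact keys your deployment needs.

```bash
# Illustrative keys only -- copy the real names from .env.example
OPENAI_API_KEY=sk-...
OLLAMA_BASE_URL=http://localhost:11434
```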

## Production Deployment

### Using Terraform for Google Kubernetes Engine (GKE)

**1. Set up the Cluster:**

If you're deploying the application to GKE, you can use Terraform to automate the setup of your Kubernetes cluster. Navigate to the `iac/terraform` directory and initialize Terraform:

```bash
cd iac/terraform

terraform init
```

**Plan and Apply Configuration:**

Generate an execution plan to verify the resources that Terraform will create or modify, and then apply the configuration to set up the cluster:

```bash
terraform plan
terraform apply
```

**2. Retrieve Cluster Information:**

To interact with your GKE cluster, you'll need to retrieve its configuration. You can view the current cluster configuration with the following command:

```bash
cat ~/.kube/config
```

Ensure your `kubectl` context is set correctly to manage the cluster.
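
If the context is missing, you can usually fetch cluster credentials with the `gcloud` CLI; the cluster name, region, and project below are placeholders for the values your Terraform configuration created:

```bash
# Placeholders -- substitute the values from your Terraform outputs
gcloud container clusters get-credentials <CLUSTER_NAME> --region <REGION> --project <PROJECT_ID>
```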

### Manual Deployment to GKE

For a more hands-on deployment process, follow these steps:

**1. Deploy Nginx Ingress Controller:**

The Nginx Ingress Controller manages external access to services in your Kubernetes cluster. Create a namespace and install the Ingress Controller using Helm:

```bash
kubectl create ns nginx-system
kubens nginx-system
helm upgrade --install nginx-ingress ./deployments/nginx-ingress
```
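
To confirm the controller is up and has received an external IP from GCP:

```bash
# The LoadBalancer service's EXTERNAL-IP is the public entry point
kubectl get pods,svc -n nginx-system
```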

**2. Configure API Key Secret:**

Store your environment variables, such as API keys, securely in Kubernetes secrets. Create a namespace for model serving and create a secret from your `.env` file:

```bash
kubectl create ns model-serving
kubens model-serving
kubectl delete secret promptalchemy-env --ignore-not-found
kubectl create secret generic promptalchemy-env --from-env-file=.env -n model-serving
kubectl describe secret promptalchemy-env -n model-serving
```

**3. Grant Permissions:**

Kubernetes resources often require specific permissions. Apply the necessary roles and bindings:

```bash
cd deployments/infrastructure
kubectl apply -f role.yaml
kubectl apply -f rolebinding.yaml
```
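
For illustration, a minimal Role of the kind `role.yaml` might define is sketched below; this is an assumed example, and the actual manifests in `deployments/infrastructure` are authoritative:

```bash
# Hypothetical sketch only -- see deployments/infrastructure for the real manifests
kubectl apply -n model-serving -f - <<'EOF'
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: model-serving-manager
  namespace: model-serving
rules:
  - apiGroups: [""]
    resources: ["pods", "secrets", "configmaps"]
    verbs: ["get", "list", "watch", "create", "update", "patch", "delete"]
EOF
```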

**4. Deploy LiteLLM:**

Deploy the [LiteLLM](https://github.com/BerriAI/litellm) service:

```bash
kubens model-serving
helm upgrade --install litellm ./deployments/litellm
```
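
Once the pods are ready, you can smoke-test the proxy locally. The service name `litellm` and port `4000` are assumptions based on the chart above and LiteLLM's default proxy port; check `kubectl get svc -n model-serving` for the actual values:

```bash
# Assumed service name and port -- verify with: kubectl get svc -n model-serving
kubectl port-forward svc/litellm 4000:4000 -n model-serving &
curl http://localhost:4000/v1/models  # may require an Authorization header, depending on your config
```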

**5. Deploy Open WebUI:**

Next, deploy the web UI to your GKE cluster:

```bash
cd open-webui
kubens model-serving
kubectl apply -f ./kubernetes/manifest/base
```
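
Watch the workloads until they reach a `Running` state before moving on:

```bash
kubectl get pods -n model-serving -w
```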

**6. Deploy the Semantic Caching Service (Redis):**

Now, deploy the semantic caching service, which uses Redis to cache chatbot responses:

```bash
cd ./deployments/redis
helm dependency build
helm upgrade --install redis .
```

### Continuous Integration/Continuous Deployment (CI/CD) with Jenkins and Ansible

For automated CI/CD pipelines, use Jenkins and Ansible as follows:

**1. Set up Jenkins Server:**

Create a Google Compute Engine instance for Jenkins. Ensure it's accessible on the necessary ports:

- **Instance Name:** jenkins-server
- **OS:** Ubuntu 22.04
- **Ports:** Allow traffic on 8081 (Jenkins UI) and 50000 (Jenkins agent).
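
The repository includes an Ansible playbook that provisions this instance, including the firewall rule allowing traffic on ports 8081 and 50000:

```bash
ansible-playbook iac/ansible/deploy_jenkins/create_compute_instance.yaml
```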

**2. Deploy Jenkins:**

Use Ansible to automate the deployment of Jenkins on your instance:

```bash
ansible-playbook -i iac/ansible/inventory iac/ansible/deploy_jenkins/deploy_jenkins.yaml
```

**3. Access Jenkins:**

Once Jenkins is deployed, access it via your browser:

```plaintext
http://<EXTERNAL_IP>:8081
```
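
On first login, Jenkins prompts for an initial admin password. Assuming Jenkins runs in a Docker container on the instance (as the Ansible playbook configures) and the container is named `jenkins` (an assumption; check `docker ps`), you can read it with:

```bash
# SSH into jenkins-server first; the container name "jenkins" is an assumption
docker exec jenkins cat /var/jenkins_home/secrets/initialAdminPassword
```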

**4. Install Jenkins Plugins:**

Install the following plugins to integrate Jenkins with Docker, Kubernetes, and GKE:

- Docker
- Docker Pipeline
- Kubernetes
- GCloud SDK
- Google Kubernetes Engine

**5. Configure Jenkins:**

Set up your GitHub repository in Jenkins, and add the necessary credentials for DockerHub and GKE.

### Monitoring with Prometheus

To monitor your deployed application, follow these steps:

**1. Install Dependencies:**

Prometheus requires certain dependencies that can be managed with Helm. Navigate to the monitoring directory and build these dependencies:

```bash
cd deployments/monitoring/kube-prometheus-stack
helm dependency build
```

**2. Deploy Prometheus:**

Deploy Prometheus and its associated services using Helm:

```bash
helm upgrade --install -f deployments/monitoring/kube-prometheus-stack.expanded.yaml kube-prometheus-stack deployments/monitoring/kube-prometheus-stack -n monitoring
```

This setup will provide monitoring capabilities for your Kubernetes cluster, ensuring you can track performance and troubleshoot issues.
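
To reach the bundled Grafana UI locally, port-forward its service; the service name below follows kube-prometheus-stack's default naming and may differ depending on your values file:

```bash
# Default chart naming assumed -- verify with: kubectl get svc -n monitoring
kubectl port-forward svc/kube-prometheus-stack-grafana -n monitoring 3000:80
```

Grafana is then available at http://localhost:3000.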


## 📝 To-Do List

### 🚀 Deployment
- [x] Implement core features
- [x] Set up CI pipeline (Jenkins)
- [x] IaC (Ansible + Terraform)
- [x] Monitoring (Grafana + Prometheus + Alert)
- [x] Caching chatbot responses (Redis)
- [ ] Tracing (Jaeger)
- [ ] Set up CD pipeline (Argo CD)
- [ ] Optimize performance (Batching)

### 📚 Documentation
- [ ] Write user guide
- [ ] Create tutorials and examples

### 🌟 Post-Launch
- [ ] Gather user feedback
- [ ] Implement enhancements
- [ ] Plan for future updates

## Contributing
We welcome contributions to PromptAlchemy! Please see our CONTRIBUTING.md for more information on how to get started.
23 changes: 23 additions & 0 deletions deployments/litellm/.helmignore
@@ -0,0 +1,23 @@
# Patterns to ignore when building packages.
# This supports shell glob matching, relative path matching, and
# negation (prefixed with !). Only one pattern per line.
.DS_Store
# Common VCS dirs
.git/
.gitignore
.bzr/
.bzrignore
.hg/
.hgignore
.svn/
# Common backup files
*.swp
*.bak
*.tmp
*.orig
*~
# Various IDEs
.project
.idea/
*.tmproj
.vscode/
