This repository contains the middleware for the Chatur project.
There are some Terraform templates in the deploy subdirectory that can be used to create a fresh installation.
If you'd like to install everything manually, you can follow these steps:
Any Kubernetes cluster will work fine. We've been using k3s because it's easy to deploy and will automatically detect and configure support for GPUs. We also have an Ansible role that can be used to deploy k3s clusters easily.
Note that the Kubernetes cluster should have GPUs installed. The LLM requests will work without a GPU, but the response times will be much slower.
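One way to check whether the cluster actually exposes GPUs is to list each node's allocatable GPU count. This is a minimal sketch that assumes the GPUs are advertised under the nvidia.com/gpu resource name (as the NVIDIA device plugin does); other vendors use different resource names. The function name list_node_gpus is only illustrative.

```shell
# List every node together with its allocatable GPU count.
# Assumption: GPUs are advertised as the "nvidia.com/gpu" resource.
# The dot inside the resource name is escaped so kubectl treats it as one key.
list_node_gpus() {
  kubectl get nodes \
    -o custom-columns='NODE:.metadata.name,GPUS:.status.allocatable.nvidia\.com/gpu'
}

# Example: list_node_gpus
```

Nodes that do not advertise the resource typically show `<none>` in the GPUS column.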
The NGINX Ingress Controller is used to provide access to the cluster and to manage TLS. You can find instructions for installing it in its Getting Started Guide, or you can run these commands (assuming you have helm installed):
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install nginx-ingress ingress-nginx/ingress-nginx
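The commands in this guide assume that kubectl and helm are already on your PATH. A small helper like the one below can confirm that before you start. It is only a sketch, and the function name missing_tools is made up for illustration.

```shell
# Print the name of every requested tool that cannot be found on PATH.
# Prints nothing when all of the tools are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example: missing_tools kubectl helm
```

If the example call prints nothing, both tools were found.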
We're currently using cert-manager to manage certificates signed by Let's Encrypt. The certificate management itself is defined in the file k8s/cert-issuer.yaml. You may have to edit some of the details in this file if you're deploying this software outside of CyVerse. You can use the cert-manager installation instructions to get started or you can follow these instructions:
helm repo add jetstack https://charts.jetstack.io
helm repo update
kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.3/cert-manager.crds.yaml
helm install \
cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.13.3
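After the chart is installed, it can be useful to wait until the deployments in the namespace report that they are available before moving on. The helper below is a sketch built on kubectl wait. The function name wait_for_deployments and the default timeout of 120s are assumptions, not values taken from this project.

```shell
# Block until every deployment in the given namespace is Available.
# $1: namespace to check
# $2: optional timeout (defaults to 120s, an arbitrary choice)
wait_for_deployments() {
  namespace="$1"
  timeout="${2:-120s}"
  kubectl wait deployment --all \
    --namespace "$namespace" \
    --for=condition=Available \
    --timeout="$timeout"
}

# Example: wait_for_deployments cert-manager
```

The same helper can be pointed at any other namespace created while following this guide.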
We're using Apache APISIX to provide OIDC integration and to easily expose endpoints. You can find installation instructions in the APISIX documentation or you can follow these instructions:
helm repo add apisix https://charts.apiseven.com
helm repo update
helm install \
apisix apisix/apisix \
--create-namespace --namespace apisix \
--set etcd.replicaCount=1 \
--set 'apisix.admin.allow.ipList={0.0.0.0/0}'
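Note that an allow list of 0.0.0.0/0 permits Admin API requests from any source address that can reach it. If you want to narrow this down, the sketch below builds the same --set value from one or more CIDR ranges. The function name admin_allow_list is hypothetical, and the ranges in the example are placeholders rather than recommendations.

```shell
# Build the value for helm's --set flag from one or more CIDR ranges.
# The ranges are joined with commas inside braces, which is helm's list syntax.
admin_allow_list() {
  joined=$(IFS=,; printf '%s' "$*")
  printf 'apisix.admin.allow.ipList={%s}\n' "$joined"
}

# Example (placeholder range):
#   helm upgrade apisix apisix/apisix --namespace apisix --reuse-values \
#     --set "$(admin_allow_list 10.0.0.0/8)"
```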
Some of the settings in the Kubernetes manifests may need to be changed for your environment, so review all of the YAML files in the k8s directory to verify that the configuration settings are correct. Some things that you may have to change are configuration options, host names, and locations of files in the data store. Once all of the prerequisites are in place, you can deploy the services by running kubectl apply -f k8s.
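Before applying the manifests for real, you may want to ask the API server to validate them without saving anything. This is a sketch that assumes the cluster is reachable and that the custom resource definitions installed earlier (for example, those from cert-manager) are already present. The function name preview_manifests is illustrative.

```shell
# Send the manifests to the API server for validation without persisting them.
# $1: directory containing the manifests (defaults to k8s)
preview_manifests() {
  dir="${1:-k8s}"
  kubectl apply --dry-run=server -f "$dir"
}

# Example: preview_manifests k8s
```

If the dry run reports errors, fix the affected YAML files before running the real kubectl apply.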