
Commit

Merge pull request #3 from IlyasMoutawwakil/refoctor+syncio
refactored + syncio
IlyasMoutawwakil authored Mar 5, 2024
2 parents 6076177 + c591562 commit 40c9f1b
Showing 17 changed files with 417 additions and 505 deletions.
34 changes: 34 additions & 0 deletions .github/workflows/quality.yaml
@@ -0,0 +1,34 @@
+name: quality
+
+on:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+
+concurrency:
+  group: ${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}
+  cancel-in-progress: true
+
+jobs:
+  check_quality:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Checkout code
+        uses: actions/checkout@v3
+
+      - name: Set up Python 3.10
+        uses: actions/setup-python@v3
+        with:
+          python-version: "3.10"
+
+      - name: Install quality requirements
+        run: |
+          pip install --upgrade pip
+          pip install -e .[quality]
+      - name: Check quality
+        run: |
+          make quality
2 changes: 1 addition & 1 deletion .github/workflows/release.yaml
@@ -5,7 +5,7 @@ on:
    types: [created]

jobs:
-  deploy:
+  release:
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
9 changes: 5 additions & 4 deletions .github/workflows/tests.yaml → .github/workflows/test.yaml
@@ -1,4 +1,4 @@
-name: tests
+name: test

on:
  push:
@@ -24,10 +24,11 @@ jobs:
        with:
          python-version: "3.10"

-      - name: Install requirements
+      - name: Install testing requirements
        run: |
          pip install --upgrade pip
-          pip install -e .
+          pip install -e .[testing]
      - name: Run test
-        run: python tests/test.py
+        run: |
+          make test
2 changes: 1 addition & 1 deletion Makefile
@@ -10,7 +10,7 @@ style:
	ruff check --fix .

test:
-	python tests/test.py
+	pytest tests/ -x

install:
	pip install -e .
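
The `test` target now runs the suite through pytest, with `-x` stopping at the first failure. For illustration, here is a hypothetical pytest-style test in the shape this target expects, using only the TGI/TEI API visible elsewhere in this commit (the repository's actual tests may differ):

```python
# Hypothetical tests (not the repository's actual test file), exercising the
# API shown in this commit's README.md and example.py.
from py_txi.text_embedding_inference import TEI, TEIConfig
from py_txi.text_generation_inference import TGI, TGIConfig


def test_tgi_generate():
    llm = TGI(config=TGIConfig(sharded="false"))
    try:
        output = llm.generate(["Hi, I'm a language model"])
        assert len(output) == 1  # one completion per prompt
    finally:
        llm.close()  # always stop the container, even on assertion failure


def test_tei_encode():
    embed = TEI(config=TEIConfig(pooling="cls"))
    try:
        output = embed.encode(["Hi, I'm an embedding model"])
        assert len(output) == 1  # one embedding per input
    finally:
        embed.close()
```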
47 changes: 20 additions & 27 deletions README.md
@@ -1,51 +1,44 @@
-# Py-TGI (Py-TXI at this point xD)
+# Py-TXI (previously Py-TGI)

-[![PyPI version](https://badge.fury.io/py/py-tgi.svg)](https://badge.fury.io/py/py-tgi)
-[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/py-tgi)](https://pypi.org/project/py-tgi/)
-[![PyPI - Format](https://img.shields.io/pypi/format/py-tgi)](https://pypi.org/project/py-tgi/)
-[![Downloads](https://pepy.tech/badge/py-tgi)](https://pepy.tech/project/py-tgi)
-[![PyPI - License](https://img.shields.io/pypi/l/py-tgi)](https://pypi.org/project/py-tgi/)
-[![Tests](https://github.com/IlyasMoutawwakil/py-tgi/actions/workflows/tests.yaml/badge.svg)](https://github.com/IlyasMoutawwakil/py-tgi/actions/workflows/tests.yaml)
+[![PyPI version](https://badge.fury.io/py/py-txi.svg)](https://badge.fury.io/py/py-txi)
+[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/py-txi)](https://pypi.org/project/py-txi/)
+[![PyPI - Format](https://img.shields.io/pypi/format/py-txi)](https://pypi.org/project/py-txi/)
+[![Downloads](https://pepy.tech/badge/py-txi)](https://pepy.tech/project/py-txi)
+[![PyPI - License](https://img.shields.io/pypi/l/py-txi)](https://pypi.org/project/py-txi/)
+[![Tests](https://github.com/IlyasMoutawwakil/py-txi/actions/workflows/tests.yaml/badge.svg)](https://github.com/IlyasMoutawwakil/py-txi/actions/workflows/tests.yaml)

-Py-TGI is a Python wrapper around [Text-Generation-Inference](https://github.com/huggingface/text-generation-inference) and [Text-Embedding-Inference](https://github.com/huggingface/text-embeddings-inference) that enables creating and running TGI/TEI instances through the awesome `docker-py` in a similar style to Transformers API.
+Py-TXI is a Python wrapper around [Text-Generation-Inference](https://github.com/huggingface/text-generation-inference) and [Text-Embedding-Inference](https://github.com/huggingface/text-embeddings-inference) that enables creating and running TGI/TEI instances through the awesome `docker-py` in a similar style to Transformers API.

## Installation

```bash
-pip install py-tgi
+pip install py-txi
```

-Py-TGI is designed to be used in a similar way to Transformers API. We use `docker-py` (instead of a dirty `subprocess` solution) so that the containers you run are linked to the main process and are stopped automatically when your code finishes or fails.
+Py-TXI is designed to be used in a similar way to Transformers API. We use `docker-py` (instead of a dirty `subprocess` solution) so that the containers you run are linked to the main process and are stopped automatically when your code finishes or fails.
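
For readers unfamiliar with the pattern, here is a minimal sketch of tying a container's lifetime to the Python process with `docker-py`. It is an illustration only, not Py-TXI's actual implementation, and `alpine`/`sleep` are placeholders for a TGI/TEI server image and command:

```python
# Minimal sketch of process-linked container cleanup with docker-py.
# Not Py-TXI's actual code; "alpine"/"sleep" are placeholder image/command.
import atexit

import docker

client = docker.from_env()
container = client.containers.run("alpine", "sleep 300", detach=True)
# atexit ties the container's lifetime to this process: the handler runs
# when the interpreter exits, whether the script finishes or raises.
atexit.register(container.remove, force=True)
```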

## Usage

Here's an example of how to use it:

```python
-from py_tgi import TGI, is_nvidia_system, is_rocm_system
+from py_txi import TGI, is_nvidia_system, is_rocm_system

-llm = TGI(
-    model="NousResearch/Llama-2-7b-hf",
-    devices=["/dev/kfd", "/dev/dri"] if is_rocm_system() else None,
-    gpus="all" if is_nvidia_system() else None,
-)
+llm = TGI(config=TGIConfig(sharded="false"))
output = llm.generate(["Hi, I'm a language model", "I'm fine, how are you?"])
-print(output)
+print("LLM:", output)
llm.close()
```

-Output: ```[" and I'm here to help you with any questions you have. What can I help you with", "\nUser 0: I'm doing well, thanks for asking. I'm just a"]```
+Output: ```LLM: ["er. I'm a language modeler. I'm a language modeler. I'm a language", " I'm fine, how are you? I'm fine, how are you? I'm fine,"]```
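
Note that the snippet above uses `TGIConfig` without importing it; the updated `example.py` in this same commit pulls the config classes from `py_txi`'s submodules, so the complete imports for the snippet would be:

```python
# Imports for the snippet above, matching the updated example.py in this
# commit; whether py_txi re-exports these at the top level is not shown here.
from py_txi.text_generation_inference import TGI, TGIConfig
```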

```python
-from py_tgi import TEI, is_nvidia_system
-
-embed = TEI(
-    model="BAAI/bge-large-en-v1.5",
-    dtype="float16",
-    pooling="mean",
-    gpus="all" if is_nvidia_system() else None,
-)
+from py_txi import TEI, is_nvidia_system
+
+embed = TEI(config=TEIConfig(pooling="cls"))
output = embed.encode(["Hi, I'm an embedding model", "I'm fine, how are you?"])
-print(output)
+print("Embed:", output)
embed.close()
```

Output: ```[array([[ 0.01058742, -0.01588806, -0.03487622, ..., -0.01613717,
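
The arrays above suggest one vector per input, so a natural follow-up is a similarity score. This is a hypothetical continuation, assuming only that `encode()` returns numpy arrays as the printed output indicates:

```python
# Hypothetical follow-up: cosine similarity between the two embeddings.
# Assumes encode() returned numpy arrays, as the printed output suggests.
import numpy as np

a = np.ravel(output[0])  # flatten in case vectors come back as shape (1, dim)
b = np.ravel(output[1])
cosine = float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
print(f"cosine similarity: {cosine:.3f}")
```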
24 changes: 8 additions & 16 deletions example.py
@@ -1,20 +1,12 @@
-from py_tgi import TEI, TGI, is_nvidia_system, is_rocm_system
+from py_txi.text_embedding_inference import TEI, TEIConfig
+from py_txi.text_generation_inference import TGI, TGIConfig

-if is_nvidia_system():
-    llm = TGI(model="NousResearch/Llama-2-7b-hf", gpus="all", port=1234)
-elif is_rocm_system():
-    llm = TGI(model="NousResearch/Llama-2-7b-hf", devices=["/dev/kfd", "/dev/dri"], port=1234)
-else:
-    llm = TGI(model="NousResearch/Llama-2-7b-hf", port=1234)
+embed = TEI(config=TEIConfig(pooling="cls"))
+output = embed.encode(["Hi, I'm an embedding model", "I'm fine, how are you?"])
+print("Embed:", output)
+embed.close()

+llm = TGI(config=TGIConfig(sharded="false"))
output = llm.generate(["Hi, I'm a language model", "I'm fine, how are you?"])
print("LLM:", output)
-
-if is_nvidia_system():
-    embed = TEI(model="BAAI/bge-large-en-v1.5", dtype="float16", pooling="mean", gpus="all", port=4321)
-else:
-    embed = TEI(model="BAAI/bge-large-en-v1.5", dtype="float16", pooling="mean", port=4321)
-
-output = embed.encode(["Hi, I'm an embedding model", "I'm fine, how are you?"])
-print("Embed:", output)
+llm.close()
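
Since the rewritten example calls `close()` by hand, one safe variant is to route cleanup through the standard library so it runs even if `generate()` raises. This is a sketch assuming only that `TGI` exposes `close()`, as shown above; the diff does not say whether py-txi objects support the `with` statement directly:

```python
# Sketch: contextlib.closing() calls close() on exit, even when the body
# raises. Assumes only the close() method shown in the example above.
from contextlib import closing

from py_txi.text_generation_inference import TGI, TGIConfig

with closing(TGI(config=TGIConfig(sharded="false"))) as llm:
    print("LLM:", llm.generate(["Hi, I'm a language model"]))
```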
143 changes: 0 additions & 143 deletions py_tgi/inference_server.py

This file was deleted.
