
Update README.md
IlyasMoutawwakil authored Feb 22, 2024
1 parent 4a0e7d4 commit a87caf2
Showing 1 changed file with 6 additions and 5 deletions.
README.md: 11 changes (6 additions & 5 deletions)
@@ -1,6 +1,6 @@
# Py-TGI

-Py-TGI is a Python wrapper around [TGI](https://github.com/huggingface/text-generation-inference) to enable creating and running TGI servers in a similar style to vLLM.
+Py-TGI is a Python wrapper around [Text-Generation-Inference](https://github.com/huggingface/text-generation-inference) that enables creating and running TGI instances through the awesome `docker-py`, in a similar style to the Transformers API.

## Installation

@@ -10,17 +10,18 @@ pip install py-tgi

## Usage

-Py-TGI is designed to be used in a similar way to vLLM. Here's an example of how to use it:
+Py-TGI is designed to be used in a similar way to the Transformers API. We use `docker-py` (instead of a dirty `subprocess` solution) so that the containers you run are linked to the main process and are stopped automatically when your code finishes or fails.
+Here's an example of how to use it:

```python
from py_tgi import TGI
from py_tgi.utils import is_nvidia_system, is_rocm_system

llm = TGI(
    model="TheBloke/Llama-2-7B-AWQ",  # awq model checkpoint
-    devices=["/dev/kfd", "/dev/dri"] if is_rocm_system() else None,  # custom devices (ROCm)
-    gpus="all" if is_nvidia_system() else None,  # all gpus (NVIDIA)
-    quantize="gptq",  # use exllama kernels (rocm compatible)
+    quantize="gptq",  # use exllama kernels (awq compatible)
+    devices=["/dev/kfd", "/dev/dri"] if is_rocm_system() else None,
+    gpus="all" if is_nvidia_system() else None,
)
output = llm.generate(["Hi, I'm a language model", "I'm fine, how are you?"])
print(output)
```
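
As context for the `docker-py` rationale in the new Usage paragraph: the auto-cleanup behavior it describes can be sketched with plain `docker-py`. This is a minimal illustration, not Py-TGI's actual implementation; the TGI image tag, port mapping, and `atexit` hook are assumptions made for the sketch.

```python
# Minimal sketch (not Py-TGI's actual code) of the docker-py pattern the
# commit message describes: a container tied to the main Python process,
# cleaned up automatically when the script finishes or fails.
import atexit

import docker

client = docker.from_env()

# Start a TGI server container in the background (detached),
# mapping its port 80 to localhost:8080 (illustrative values).
container = client.containers.run(
    "ghcr.io/huggingface/text-generation-inference:latest",
    detach=True,
    ports={"80/tcp": 8080},
)

# atexit handlers run on normal interpreter shutdown, including after an
# unhandled exception, so the container does not outlive the script.
atexit.register(container.remove, force=True)
```

This is the lifecycle guarantee the new paragraph claims for `docker-py`, in contrast to a detached `subprocess` that could leave the server running after a crash.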
