[New Updates] LLaVA OneVision Release; MVBench, InternVL2, IXC2.5 Interleave-Bench integration. #182

Merged 3 commits on Aug 7, 2024
1 change: 1 addition & 0 deletions .gitignore
@@ -38,3 +38,4 @@ llava-video/
Video-MME/
VATEX/
lmms_eval/tasks/vatex/__pycache__/utils.cpython-310.pyc
lmms_eval/tasks/mlvu/__pycache__/utils.cpython-310.pyc
30 changes: 30 additions & 0 deletions docs/commands.md
@@ -22,3 +22,33 @@ This mode supports a number of command-line arguments, the details of which can

* `--limit` : Accepts an integer, or a float between 0.0 and 1.0. If passed, limits evaluation to the first X documents per task (if an integer) or the first X% of documents per task (if a float). Useful for debugging, especially on costly API models (see the example below).
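For example, a minimal debugging sketch (reusing the `srt_api` invocation from the SRT API section below, with an arbitrary limit of 8 documents) might look like:

```bash
# Debug run: evaluate only the first 8 documents of the ai2d task
python -m accelerate.commands.launch --main_process_port=12580 --num_processes=1 lmms_eval \
    --model=srt_api --model_args=modality=image,host=127.0.0.1,port=30000 \
    --tasks=ai2d --batch_size=1 --limit=8 --output_path=./logs/
```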

## Usage with SRT API

> Install sglang

```bash
git clone https://github.com/EvolvingLMMs-Lab/sglang.git
cd sglang
git checkout dev/onevision
pip install -e "python[srt]"
```
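
As a quick, optional sanity check, the editable install can be verified by importing the package (plain Python, nothing sglang-specific is assumed here):

```bash
# The import should succeed without errors if the install worked
python -c "import sglang" && echo "sglang installed"
```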

> Run the sglang backend service with the following commands

```bash
# backend service
python -m sglang.launch_server --model-path "/path/to/onevision" --tokenizer-path lmms-lab/llavanext-qwen-siglip-tokenizer --port=30000 --host=127.0.0.1 --tp-size=8 --chat-template=chatml-llava

# launch lmms-eval srt_api model
python -m accelerate.commands.launch --main_process_port=12580 --num_processes=1 lmms_eval --model=srt_api --model_args=modality=image,host=127.0.0.1,port=30000 --tasks=ai2d --batch_size=1 --log_samples --log_samples_suffix=debug --output_path=./logs/ --verbosity=DEBUG
```
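
Because the server may take a while to load the model weights, one optional approach (a plain shell sketch using `nc`, nothing sglang-specific) is to wait until port 30000 accepts connections before launching the evaluation:

```bash
# Optional: block until the sglang backend accepts connections on port 30000
until nc -z 127.0.0.1 30000; do
    echo "waiting for sglang server on 127.0.0.1:30000 ..."
    sleep 5
done
```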

If the above commands fail, you may need to install the following additional dependencies:

```bash
pip install httpx==0.23.3
pip install protobuf==3.20
pip install flashinfer -i https://flashinfer.ai/whl/cu121/torch2.3/
```

