Commit

Update Popular_Models_Guide/Llama2/trtllm_guide.md
Co-authored-by: Hyunjae Woo <[email protected]>
jbkyang-nvi and nv-hwoo authored Nov 8, 2023
1 parent c13312f commit 9dfd3fd
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion Popular_Models_Guide/Llama2/trtllm_guide.md
@@ -136,7 +136,12 @@ You can test the results of the run with:
```bash
# Using the SDK container as an example
-docker run --rm -it --net host --shm-size=2g --ulimit memlock=-1 --ulimit stack=67108864 --gpus all -v /path/to/tensorrtllm_backend:/tensorrtllm_backend -v /path/to/Llama2/repo:/Llama-2-7b-hf -v /path/to/engines:/engines nvcr.io/nvidia/tritonserver:23.10-py3-sdk
+docker run --rm -it --net host --shm-size=2g \
+--ulimit memlock=-1 --ulimit stack=67108864 --gpus all \
+-v /path/to/tensorrtllm_backend:/tensorrtllm_backend \
+-v /path/to/Llama2/repo:/Llama-2-7b-hf \
+-v /path/to/engines:/engines \
+nvcr.io/nvidia/tritonserver:23.10-py3-sdk
# install extra dependencies for the script
pip3 install transformers sentencepiece
python3 /tensorrtllm_backend/inflight_batcher_llm/client/inflight_batcher_llm_client.py --request-output-len 200 --tokenizer_type llama --tokenizer_dir /Llama-2-7b-hf