System Info
CPU architecture: x86_64
CPU/Host memory size: 256GB
GPU name: NVIDIA H100
Libraries
TensorRT-LLM branch or tag: deepseek(0.17.0.dev2024121700)
NVIDIA driver version: 560.35.03
OS: Ubuntu 22.04 LTS
Who can help?
No response
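Information
The official example scripts
My own modified scripts
Tasks
An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
My own task or dataset (give details below)
Reproduction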
1. Build the whisper engine using the official example.
2. Run examples/run.py.
Expected behavior
None
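Actual behavior
Test case: input=512, output=512
bs=1 total_time: 22.55248665 s
bs=2 total_time: 29.60423994 s
bs=4 total_time: 132.4225805 s
Going from bs=2 to bs=4 raises the total time from about 29.6 s to about 132.4 s (roughly 4.5x), so performance degrades sharply at batch size 4.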
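To put the timings above on a common scale, the sketch below converts each total_time into output tokens per second. It assumes that every request in the batch generates the full 512 output tokens and that total_time covers the whole batch; neither is confirmed by the report, and the helper name `throughput` is only illustrative.

```python
# Timings copied from the report above (seconds, keyed by batch size).
timings_s = {1: 22.55248665, 2: 29.60423994, 4: 132.4225805}

# Assumption: each request generates exactly 512 output tokens.
OUTPUT_LEN = 512


def throughput(batch_size, total_time_s, output_len=OUTPUT_LEN):
    """Return generated tokens per second for one batch (hypothetical helper)."""
    return batch_size * output_len / total_time_s


baseline = throughput(1, timings_s[1])
for bs, t in timings_s.items():
    tps = throughput(bs, t)
    # Compare each batch size against the bs=1 throughput.
    print(f"bs={bs}: {tps:.1f} tok/s, {tps / baseline:.2f}x vs bs=1")
```

Under these assumptions, throughput improves from bs=1 to bs=2 but at bs=4 falls below the bs=1 figure, which is the regression being reported.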
Additional notes
Are there any plans to improve the throughput of DeepSeek-V3?