Remove avg T2T latency sample output from prefill benchmark
nv-hwoo committed Oct 4, 2023
1 parent c2dd174 commit b9cb7fc
Showing 1 changed file with 3 additions and 3 deletions.
6 changes: 3 additions & 3 deletions src/c++/perf_analyzer/docs/llm.md
@@ -73,9 +73,9 @@ python profile.py -m vllm --prompt-size-range 100 500 200 --max-tokens 1

# Sample output
# [ Benchmark Summary ]
-# Prompt size: 100, Average first-token latency: 0.0459 sec, Average token-token latency: 0.0007 sec
-# Prompt size: 300, Average first-token latency: 0.0415 sec, Average token-token latency: 0.0007 sec
-# Prompt size: 500, Average first-token latency: 0.0451 sec, Average token-token latency: 0.0006 sec
+# Prompt size: 100, Average first-token latency: 0.0459 sec
+# Prompt size: 300, Average first-token latency: 0.0415 sec
+# Prompt size: 500, Average first-token latency: 0.0451 sec
```

> **Note**