Revert "Update batching explanation in docs (#36)"
This reverts commit a812e25.
pvijayakrish committed Aug 13, 2024
1 parent a812e25 commit 0ec42e3
Showing 2 changed files with 0 additions and 19 deletions.
7 changes: 0 additions & 7 deletions genai-perf/README.md
@@ -335,13 +335,6 @@ You can optionally set additional model inputs with the following option:
model with a singular value, such as `stream:true` or `max_tokens:5`. This
flag can be repeated to supply multiple extra inputs.

For [Large Language Models](docs/tutorial.md), there is no batch size (i.e.
batch size is always `1`). Each request includes the inputs for one individual
inference. Other modes such as the [embeddings](docs/embeddings.md) and
[rankings](docs/rankings.md) endpoints support client-side batching, where
`--batch-size N` means that each request sent will include the inputs for `N`
separate inferences, allowing them to be processed together.
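
As an illustration of the client-side batching described in the removed paragraph, the following minimal sketch groups input texts into request bodies of `N` inputs each. It assumes an OpenAI-compatible embeddings endpoint, whose `input` field accepts a list of strings. The helper name `build_embedding_payloads` and the sample texts are hypothetical, and this is not GenAI-Perf's actual implementation.

```python
import json


def build_embedding_payloads(texts, model, batch_size):
    """Group texts into request bodies carrying `batch_size` inputs each.

    Hypothetical helper for illustration only; not part of GenAI-Perf.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    payloads = []
    for start in range(0, len(texts), batch_size):
        # Each payload carries the inputs for up to `batch_size` inferences.
        payloads.append({"model": model, "input": texts[start:start + batch_size]})
    return payloads


# Sample inputs (made up for this sketch).
texts = [
    "What is batching?",
    "How do embeddings work?",
    "What is latency?",
    "What is throughput?",
]

payloads = build_embedding_payloads(
    texts, "intfloat/e5-mistral-7b-instruct", batch_size=2
)
for body in payloads:
    print(json.dumps(body))
```

With a batch size of 2, the four sample texts are sent as two requests of two inputs each; with a batch size of 1, each text would be sent in its own request.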

</br>

<!--
12 changes: 0 additions & 12 deletions genai-perf/docs/embeddings.md
@@ -68,18 +68,6 @@ genai-perf profile \
--input-file embeddings.jsonl
```

* `-m intfloat/e5-mistral-7b-instruct` is to specify what model you want to run
(`intfloat/e5-mistral-7b-instruct`)
* `--service-kind openai` is to specify that the server type is OpenAI-API
compatible
* `--endpoint-type embeddings` is to specify that the sent requests should be
formatted to follow the [embeddings
API](https://platform.openai.com/docs/api-reference/embeddings/create)
* `--batch-size 2` is to specify that each request will contain the inputs for 2
individual inferences, making a batch size of 2
* `--input-file embeddings.jsonl` is to specify the input data to be used for
inferencing

This will use default values for optional arguments. You can also pass in
additional arguments with the `--extra-inputs` [flag](../README.md#input-options).
For example, you could use this command: