Update on the development branch #1359
Shixiaowei02 announced in Announcements
Hi,
The TensorRT-LLM team is pleased to announce that we are pushing an update to the development branch (and the Triton backend) on March 26, 2024.
This update includes:
- `executor` C++ API, see `examples/bindings/README.md`
- See `examples/high-level-api/README.md` for a guide
- `QuantConfig`, used in the `trtllm-build` tool, supports broader quantization features
- `SamplingConfig`, used in the `LLM.generate` or `LLM.generate_async` APIs, supports beam search, a variety of penalties, and more features
- `LLM(streaming_llm=...)`
- Issue #1183: `ModelRunnerCpp` does not transfer `SamplingConfig` tensor fields correctly
- `gptManagerBenchmark`
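For readers unfamiliar with the sampling penalties mentioned in the list above, here is a minimal sketch of one common kind, a multiplicative repetition penalty applied to logits before sampling. It is a generic illustration in plain Python, not the TensorRT-LLM implementation: the function name `apply_repetition_penalty` and its parameters are hypothetical, and the exact penalty semantics of `SamplingConfig` should be taken from the high-level API README.

```python
from typing import List, Sequence


def apply_repetition_penalty(
    logits: Sequence[float],
    generated_ids: Sequence[int],
    penalty: float,
) -> List[float]:
    """Return a copy of `logits` with a multiplicative repetition penalty applied.

    Hypothetical helper for illustration only; not a TensorRT-LLM API.
    With penalty > 1, tokens that were already generated become less likely:
    positive logits are divided by the penalty, non-positive ones multiplied.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    out = list(logits)
    # Each previously generated token is penalized once, however often it appeared.
    for token_id in set(generated_ids):
        score = out[token_id]
        out[token_id] = score / penalty if score > 0 else score * penalty
    return out


logits = [2.0, -1.0, 0.5]
penalized = apply_repetition_penalty(logits, generated_ids=[0, 1, 0], penalty=2.0)
print(penalized)  # [1.0, -2.0, 0.5]
```

Tokens 0 and 1 were already generated, so their scores are pushed down, while token 2 is left unchanged. A penalty of 1.0 leaves all logits as they are.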
Thanks,
The TensorRT-LLM Engineering Team