Issues: triton-inference-server/tensorrtllm_backend
Support non-detached mode for python trtllm backend
bug
Something isn't working
#639
opened Nov 6, 2024 by
ShuaiShao93
4 tasks
the output of bls is unstable
bug
Something isn't working
#630
opened Oct 23, 2024 by
dwq370
4 tasks
Streaming Inference Failure
bug
Something isn't working
#626
opened Oct 20, 2024 by
imilli
2 of 4 tasks
The GPU memory usage is too high.
bug
Something isn't working
#625
opened Oct 19, 2024 by
imilli
2 of 4 tasks
Garbage response when input tokens is longer than 4096 on Llama-3.1-8B-Instruct
bug
Something isn't working
#624
opened Oct 18, 2024 by
winstxnhdw
2 of 4 tasks
Failed install in nvcr.io/nvidia/tritonserver:24.08-trtllm-python-py3
bug
Something isn't working
#623
opened Oct 18, 2024 by
wwx007121
4 tasks
Stark Difference in GPU Usage of Triton Servers with Llama3 and Llama3.1 models
#636
opened Oct 14, 2024 by
jasonngap1
fill_template.py and gpu_device_ids
bug
Something isn't working
#616
opened Oct 12, 2024 by
Alireza3242
2 of 4 tasks
Support dynamic path for gpt_model_path and token_dir based on Triton model repo
#615
opened Oct 11, 2024 by
rahchuenmonroe
An error that "Shape does not match true shape of 'data' field" occurs when using tensorrt_llm model alone in inflight_batcher_llm
bug
Something isn't working
#613
opened Oct 10, 2024 by
junstar92
1 of 4 tasks
Is ReDrafter supported by the TensorRT-LLM backend?
bug
Something isn't working
#610
opened Oct 5, 2024 by
vkc1vk
2 of 4 tasks
Bad quality in answers (repetition, non stop...) when using Llama3.1-8B-Instruct and Triton
bug
Something isn't working
#603
opened Sep 25, 2024 by
alvaroalfaro612
2 of 4 tasks
generation logits dtype bug
bug
Something isn't working
#598
opened Sep 11, 2024 by
binhtranmcs
2 of 4 tasks
request is blocked and non output when using tensor parallelism with multi gpus
bug
Something isn't working
#596
opened Sep 9, 2024 by
dwq370
4 tasks
Is no_repeat_ngram_size generation option supported?
bug
Something isn't working
#593
opened Sep 3, 2024 by
ghost
2 of 4 tasks
Error malloc(): unaligned tcache chunk detected Always Occur after tensorrt server handling a certain amount requests
bug
Something isn't working
#587
opened Aug 28, 2024 by
wangpeilin
2 of 4 tasks