
Investigate poor performance on large prompts #58

Open
ccmaymay opened this issue Aug 15, 2022 · 0 comments
Labels
performance Performance issues or improvements

Comments

@ccmaymay
Collaborator

ccmaymay commented Aug 15, 2022

From @nweir127:

I am calling it on a prompt that is 600-1000 tokens long, with max_new_tokens set to 36:

    model.complete(text, stop_strings=['QUESTION'], top_p=0.8, num_return_sequences=1)

Each call takes 100+ seconds on 8 GPUs.
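For comparison, here is a minimal timing sketch against a plain Hugging Face transformers causal LM, assuming that is roughly what model.complete wraps; the model name, prompt, and stop-string criterion below are illustrative placeholders, not the project's actual implementation. Two things seem worth checking: whether stop_strings triggers early stopping (if it is applied by post-hoc truncation, every call pays for all 36 new tokens), and how latency scales with prompt length (each decoding step attends over the full 600-1000-token prefix).

    import time
    import torch
    from transformers import (
        AutoModelForCausalLM,
        AutoTokenizer,
        StoppingCriteria,
        StoppingCriteriaList,
    )

    MODEL_NAME = "facebook/opt-1.3b"  # placeholder: the actual backend model is unknown


    class StopStringCriteria(StoppingCriteria):
        """End generation as soon as any stop string appears in the newly generated text."""

        def __init__(self, tokenizer, stop_strings, prompt_len):
            self.tokenizer = tokenizer
            self.stop_strings = stop_strings
            self.prompt_len = prompt_len

        def __call__(self, input_ids, scores, **kwargs):
            # Only decode the tokens generated after the prompt.
            new_text = self.tokenizer.decode(input_ids[0, self.prompt_len:])
            return any(s in new_text for s in self.stop_strings)


    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_NAME, torch_dtype=torch.float16, device_map="auto"
    )

    # Stand-in for a 600-1000-token prompt.
    prompt = "QUESTION: what is 2 + 2?\nANSWER: 4\n" * 40
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    prompt_len = inputs["input_ids"].shape[1]

    stopping = StoppingCriteriaList(
        [StopStringCriteria(tokenizer, ["QUESTION"], prompt_len)]
    )

    if torch.cuda.is_available():
        torch.cuda.synchronize()
    start = time.perf_counter()
    output = model.generate(
        **inputs,
        max_new_tokens=36,
        do_sample=True,
        top_p=0.8,
        num_return_sequences=1,
        stopping_criteria=stopping,
    )
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

    new_tokens = output.shape[1] - prompt_len
    print(f"{elapsed:.1f}s for {new_tokens} new tokens "
          f"({elapsed / new_tokens:.2f}s/token)")

If the per-token time printed here is far below what model.complete shows for the same prompt length, the overhead is likely in the wrapper (e.g., stop-string handling or multi-GPU dispatch) rather than in the model's forward passes.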

ccmaymay added the bug (Something isn't working) label on Aug 15, 2022
ccmaymay added the performance (Performance issues or improvements) label and removed the bug label on Mar 5, 2023