Eval bug: llama-server stopped working after PR #11285 got merged #11335
Comments
Can you send the request via …? From what I saw in your log, it responds with status code …
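In case it helps to take the browser out of the picture, here is a minimal sketch (my guess at what "send the request via …" is asking for, not a statement of the actual request the web UI makes) that posts a prompt straight to /completion using cpp-httplib's client, the same library llama-server embeds, and prints the status code the server answers with. The JSON fields are illustrative only:

```cpp
// Sketch only: send a completion request directly and print the HTTP status.
// Assumes a llama-server instance listening on localhost:8080.
#include <iostream>
#include <string>
#include "httplib.h"

int main() {
    httplib::Client cli("localhost", 8080);

    // Illustrative payload; the web UI may send additional fields.
    const std::string body =
        R"({"prompt": "Hello", "n_predict": 16, "stream": false})";

    auto res = cli.Post("/completion", body, "application/json");
    if (res) {
        std::cout << "status: " << res->status << "\n" << res->body << "\n";
    } else {
        std::cout << "request failed: " << httplib::to_string(res.error()) << "\n";
    }
    return 0;
}
```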
I'm not able to reproduce the bug on either my laptop (MacBook M3) or my server (Linux, NVIDIA T4).
I did more tests but still can't reproduce the issue. Please provide more info about your setup:
Also, from your log, …
Thanks for the hint. I reverted f30f099 and only applied the httplib 0.18.5 change. The issue shows up with http://localhost:8080/?something=%0A, which explains why only I saw it: I probably still had some q= or m= arg left over from playing with #11150 (btw, I'd appreciate answers to the two questions I posed in that PR if you could afford the time). That means this httplib change breaks the multi-line use cases of #11150. Do you have any idea why the newer httplib would break with URL-encoded newlines?
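To check whether the percent-decoding itself is what changed, here is a minimal standalone sketch (not llama-server's actual handler, just the bundled cpp-httplib in isolation, on an arbitrarily chosen port 9090) that reports what a query parameter such as something=%0A decodes to. Running it once against the previously bundled httplib and once against 0.18.5 should show whether the regression is in the library's URL decoding or in llama-server's own code:

```cpp
// Sketch only: isolate cpp-httplib's query-string decoding.
// Request to try against it: http://localhost:9090/?something=%0A
#include <string>
#include "httplib.h"

int main() {
    httplib::Server svr;

    svr.Get("/", [](const httplib::Request &req, httplib::Response &res) {
        // get_param_value returns the percent-decoded value, so "%0A" is
        // expected to show up here as a single "\n" character.
        const std::string v = req.get_param_value("something");
        res.set_content("decoded length: " + std::to_string(v.size()) +
                        ", is newline: " + (v == "\n" ? "yes" : "no"),
                        "text/plain");
    });

    svr.listen("localhost", 9090);
    return 0;
}
```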
OK, I think I pinpointed the problem. It only happens on …
Related to yhirose/cpp-httplib#2028.
Name and Version
llama-server f30f099
Operating systems
Linux
GGML backends
CUDA
Hardware
RTX 4090, CUDA
Models
E.g. Code Qwen 2.5 7B-Chat (Q8)
Problem description & steps to reproduce
llama-server stopped generating any tokens for me, regardless of model, starting with commit f30f099 from #11285.
Simply reverting the above commit, e.g. on top of today's master (6171c9d), fixes the issue for me.
To reproduce: go to http://localhost:8080, enter a question, and hit return; nothing happens.
First Bad Commit
f30f099
Relevant log output