
Eval bug: llama-server stopped working after PR #11285 got merged #11335

Open
tim-janik opened this issue Jan 21, 2025 · 7 comments

@tim-janik

Name and Version

llama-server f30f099

Operating systems

Linux

GGML backends

CUDA

Hardware

RTX 4090, CUDA

Models

E.g. Code Qwen 2.5 7B-Chat (Q8)

Problem description & steps to reproduce

llama-server stopped generating any tokens for me, regardless of model, starting with commit f30f099 from #11285.
Simply reverting that commit, e.g. on top of today's master (6171c9d), fixes the issue for me.
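
Roughly how I tested the revert (just a sketch; build flags depend on your setup, I build with CUDA):

```sh
# on top of master (6171c9d), revert the suspect commit and rebuild llama-server
git revert f30f099
cmake -B build -DGGML_CUDA=ON
cmake --build build --target llama-server -j
```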

To reproduce: go to http://localhost:8080, enter a question, and hit return; nothing happens.

First Bad Commit

f30f099

Relevant log output

main: server is listening on http://0.0.0.0:8080 - starting the main loop
srv  update_slots: all slots are idle
request: GET / 127.0.0.1 200
request: GET /favicon.ico  400
request: POST /v1/chat/completions  400
@ngxson ngxson self-assigned this Jan 21, 2025
@ngxson
Collaborator

ngxson commented Jan 21, 2025

Can you send the request via curl to see what the response is? (Or check the response in the browser's devtools.)

From what I see in your log, it responds with status code 400, meaning ERROR_TYPE_INVALID_REQUEST. This could potentially be due to upstream changes in the httplib library rather than a bug in llama-server.
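
For example, something along these lines (a sketch; adjust the prompt as you like):

```sh
# sketch: minimal chat completion request; -i also prints the status line and headers
curl -i http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"messages": [{"role": "user", "content": "hello"}]}'
```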

@ngxson
Collaborator

ngxson commented Jan 21, 2025

I'm not able to reproduce the bug on either my laptop (MacBook M3) or a server (Linux, NVIDIA T4).

@ngxson
Collaborator

ngxson commented Jan 21, 2025

I did more tests but still can't reproduce the issue. Please provide more info about your setup:

  • Which browser are you using?
  • What happens when the request is sent via curl, wget, or Postman?
  • What shows up if you access some other endpoints, for example /api/models?

Also, from your log, request: GET /favicon.ico 400 means that even non-API paths return a 400 error, which is wrong. This further strengthens my suspicion of a bug in httplib.
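
Something like this would print just the status codes for the paths above (a sketch):

```sh
# sketch: print only the HTTP status code returned for a few paths
for path in / /favicon.ico /api/models; do
  printf '%-16s ' "$path"
  curl -s -o /dev/null -w '%{http_code}\n' "http://localhost:8080$path"
done
```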

@tim-janik
Author

> I did more tests but still can't reproduce the issue. Please provide more info about your setup: […]

Thanks for the hint. I reverted f30f099 and only applied the httplib 0.18.5 change.
Turns out I can only trigger this issue if the URL contains a %0A, like this:

http://localhost:8080/?something=%0A

That explains why only I saw it: I probably still had some q= or m= arg left over from playing with #11150 (by the way, I'd appreciate answers to the two questions I posed in that PR if you can spare the time).

This means the httplib change breaks the multi-line use cases of #11150. Do you have any idea why the newer httplib would break on URL-encoded newlines?
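
For the record, this is enough to trigger it on my machine (a sketch, comparing the build before and after the httplib 0.18.5 bump):

```sh
# sketch: request whose query string contains a URL-encoded newline (%0A)
curl -i 'http://localhost:8080/?something=%0A'
# behaves fine with the old httplib; misbehaves for me after the 0.18.5 bump
```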

@ngxson
Collaborator

ngxson commented Jan 22, 2025

OK, sorry, I forgot about #11150. I'll have a look later because it's not a priority right now.

I'm also tagging the author of httplib, @yhirose; maybe you have an idea why it returns 400 even for endpoints that have no handler? (For example, /favicon.ico, which should have returned 404 instead of 400.)

@ngxson
Collaborator

ngxson commented Jan 23, 2025

OK, I think I pinpointed the problem. It only happens on OPTIONS requests, which corrupt memory somewhere. I repeatedly sent OPTIONS requests and got this in the log:

request:    400
request:    400
request: OPTIONS /tokenize 127.0.0.1 200
request: ndled   400
request: root   400
request: root   400
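
Roughly how I reproduced it (a sketch):

```sh
# sketch: hammer the server with OPTIONS requests and watch the log output get garbled
for i in $(seq 1 50); do
  curl -s -o /dev/null -X OPTIONS http://localhost:8080/tokenize
done
```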

@ngxson
Collaborator

ngxson commented Jan 24, 2025

Related to yhirose/cpp-httplib#2028
