HTTP2 connection not being reset when error received #1401
Comments
This is a message the load balancer sends back to the client. The client can handle it by killing the entire session and reopening it from scratch, but we don't have control over that with ConnectRPC. This is something we can revisit when we write our own session manager, but in the meantime @davidmytton can look into the load balancer configuration.
The load balancer isn't under our control: it's from Fly.io. I've opened a support case with them.
Fly have now confirmed their proxy may return
We have had the high tail latency (network jitter) support case open with Fly.io for over a week. The tail latency causes internal slowdowns in our API, which results in a timeout on the client, and I think that goes back to the Fly proxy as a reset. Given enough timeouts / resets (1024 by default), this causes the Fly proxy to trigger This is what hyperium/h2 describes at: https://github.com/hyperium/h2/blob/90359ba6d38843b106967a6ac9419a500ea26873/src/server.rs#L892

However, that error isn't being handled properly by the client: it continues to send requests on the same connection. What we see is that the Arcjet SDK client connects to our API endpoint on Fly, the Fly proxy proxies it to our app, the API processes the request normally and returns the response, the client gets

Some additional notes:
Stream tracking
A user is receiving various HTTP2 errors, e.g.
{"error":{"message":"[internal] Stream closed with error code NGHTTP2_ENHANCE_YOUR_CALM"}}
and
{"error":{"message":"[internal] Stream closed with error code NGHTTP2_INTERNAL_ERROR"}}
for every call to decide, which is then being logged via report.
I'm unsure about the cause of this, but it may be a timeout or error on a previous request. This is referenced in various places, but all suggested fixes have already gone out:
The only way to resolve this is to restart the Node process, which isn't ideal. When we get these errors, can we re-establish the connection?
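For illustration, here is a rough sketch of what re-establishing the connection could look like. It is not how the Arcjet SDK or ConnectRPC manage sessions. `ResettableSession`, `shouldResetSession`, `SessionLike` and `RESET_CODES` are hypothetical names, and matching on the error message text is an assumption based only on the two messages quoted above. The idea is to drop the cached session when one of those errors is seen, so the next request opens a fresh connection.

```typescript
// Hypothetical sketch: none of these names exist in the Arcjet SDK or ConnectRPC.

// Minimal shape of a session: anything that can be torn down.
interface SessionLike {
  destroy(): void;
}

// Error codes quoted in this issue after which the connection should not be reused.
const RESET_CODES = ["NGHTTP2_ENHANCE_YOUR_CALM", "NGHTTP2_INTERNAL_ERROR"];

// Assumption: the error message contains the nghttp2 code, as in the logs above.
function shouldResetSession(err: unknown): boolean {
  const message = err instanceof Error ? err.message : String(err);
  return RESET_CODES.some((code) => message.indexOf(code) !== -1);
}

class ResettableSession<S extends SessionLike> {
  private session: S | undefined = undefined;
  private connect: () => S;

  constructor(connect: () => S) {
    this.connect = connect;
  }

  // Return the cached session, opening a new one if none is cached.
  get(): S {
    if (this.session === undefined) {
      this.session = this.connect();
    }
    return this.session;
  }

  // Pass every request error here. Returns true if the session was dropped,
  // in which case the next get() reconnects from scratch.
  reportError(err: unknown): boolean {
    if (!shouldResetSession(err)) {
      return false;
    }
    if (this.session !== undefined) {
      this.session.destroy();
      this.session = undefined;
    }
    return true;
  }
}
```

With Node's built-in `node:http2` module, the factory could be something like `() => http2.connect(url)`, since the session it returns has a `destroy()` method. Whether the caller should also retry the failed request is a separate decision that this sketch leaves out.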