Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Endless loop on GOAWAY #179

Open
tmolitor-stud-tu opened this issue Jan 8, 2021 · 6 comments
Open

Endless loop on GOAWAY #179

tmolitor-stud-tu opened this issue Jan 8, 2021 · 6 comments
Labels

Comments

@tmolitor-stud-tu
Copy link

I'm using lua-http to talk to Apples's push servers.
My code can be found here: https://github.com/tmolitor-stud-tu/mod_push_appserver/blob/master/mod_push_appserver_apns/mod_push_appserver_apns.lua#L98

When the HTTP/2 connection to Apple is idle for a long time (> ~15 minutes), sending a new request (R1) makes Apple to respond with a GOAWAY frame having the id of this new request.
This creates an endless cqueues loop with cqueues returning new timeouts of 0.0 seconds in an endless loop.
Only testing connection.recv_goaway_lowest when attempting the next (R2) push and closing the connection if this is set, will stop the endless loop. This could be several minutes after (R1).
Only at this point does the call to stream:get_headers(1) for (R1) return with an error read: Transport endpoint is not connected and the loop stops.

Sidenote: I had to change the cqueues integration of prosody to not reshedule an immediate cqueues run on zero second timers without running the prosody mainloop at least once, to see this happen. Using the vanilla prosody integration will result in an unresponsive server spinning forever.

Expected behaviour: Receiving a GOAWAY frame should make running requests waiting in stream:get_headers(), stream:get_body_as_string() etc. having a bigger h2 stream id than the GOAWAY return an error instead of spinning forever.

@daurnimator
Copy link
Owner

Thanks, this indeed sounds like a bug.
I don't suppose you could up with a self-contained test case?

@tmolitor-stud-tu
Copy link
Author

Well, the whole thing would need a h2 server behaving oike apple's push endpoint.
Can a lua-http server programmatically send out goaway frames?
A testcase would be such a server and a lua-http client sending some requsts, waiting for several minutes, then sending a new request which will be answered by the server with a goaway (but without closing the tcp connection, I guess).

Other idea: you tell me what debug output to put where and I'll add this to our push server for Monal (we are using a test server for the alpha builds of Monal which is ideal for such tests).

Other observation: This bug seems to be in all versions (0.1 to 0.3) of lua-http.

@daurnimator
Copy link
Owner

Well, the whole thing would need a h2 server behaving oike apple's push endpoint.
Can a lua-http server programmatically send out goaway frames?

yes. The test suite has several such cases. see e.g. https://github.com/daurnimator/lua-http/blob/master/spec/h2_stream_spec.lua

A testcase would be such a server and a lua-http client sending some requsts, waiting for several minutes, then sending a new request which will be answered by the server with a goaway (but without closing the tcp connection, I guess).

no need to wait the 15 minutes. It sounds like you could:

  1. start a new stream from client
  2. send a goaway from server
  3. assert that client doesn't get stuck in loop. maybe starting a new stream should throw an error/return nil, err?

@tmolitor-stud-tu
Copy link
Author

  1. assert that client doesn't get stuck in loop. maybe starting a new stream should throw an error/return nil, err?

Isn't that already in the code (I think I found something like this already when I read the code)

tmolitor-stud-tu added a commit to tmolitor-stud-tu/mod_push_appserver that referenced this issue Jan 12, 2021
This uses the lua-http connection.recv_goaway condition variable to
close the connection upon receiving a GOAWAY frame.
@jprjr
Copy link

jprjr commented Mar 25, 2022

Hi there, I wrote a small standalone demo to reproduce the issue. The client connection will spin forever when trying to receive headers from a server that's sent a GOAWAY frame.

I'm not 100% sure if this is what Apple's server does as mentioned in the notes, but I think it is? I see similar behavior when connecting to nginx - if I connect, then sleep past the default nginx timeout, I can open a stream, write headers, but then loop forever when trying to read headers.

local h2_connection = require "http.h2_connection"
local h2_error = require "http.h2_error"
local new_headers = require "http.headers".new
local cqueues = require'cqueues'
local ca = require "cqueues.auxlib"
local cs = require "cqueues.socket"
local cc = require "cqueues.condition"

local cq = cqueues.new()
local cond = cc.new()

local s, c = ca.assert(cs.pair())
s = assert(h2_connection.new(s,"server"))
c = assert(h2_connection.new(c,"client"))

local req_headers = new_headers()
req_headers:append(":method", "GET")
req_headers:append(":scheme", "http")
req_headers:append(":path", "/")

cq:wrap(function()
  assert(cond:wait())
  local stream = c:new_stream()
  assert(stream:write_headers(req_headers,true))
  -- this will block foerver
  local res_headers = stream:get_headers(10)
end)

cq:wrap(function()
  -- just to ensure other coroutine waits first
  cqueues.sleep(0.1)
  assert(s:write_goaway_frame(nil,h2_error.errors.INTERNAL_ERROR.code))
  cond:signal(1)
end)

assert(cq:loop())

@jprjr
Copy link

jprjr commented Mar 25, 2022

Hi there, sorry for the double-update - I wasn't 100% sure if that standalone demo was violating any specs or anything (I'm not super familiar with the details for http2), so I also re-created the case using nginx. I figure if I can recreate it using a web server, I'm probably not violating the server-side protocols/specs.

So, here's a minimal nginx config, it has a 1s timeout to receive client headers:

daemon off;
worker_processes 1;

events {
    worker_connections 128;
}

pid nginx.pid;

error_log stderr debug;

http {
    access_log off;

    client_header_timeout 1s;
    http2_recv_timeout 1s; # deprecated, nginx >= 1.19.7 uses client_header_timeout

    server {
        listen 127.0.0.1:8080 http2;
        return 404;
    }
}

and an example program that connects, sleeps for 2 seconds, then tries to perform the request.

local cqueues = require'cqueues'
local http_client = require'http.client'
local http_headers = require'http.headers'

local headers = http_headers.new()
headers:append(':method','GET')
headers:append(':path','/')
headers:append(':scheme','http')

local connection = assert(http_client.connect({
  host = '127.0.0.1',
  port = 8080,
  version = 2,
}))

cqueues.sleep(2)

local stream = assert(connection:new_stream())
assert(stream:write_headers(headers,true))
local res_headers = assert(stream:get_headers(10));
for k,v in res_headers:each() do
  print(k,v)
end

While the client is sleeping, nginx sends a goaway frame. write_headers reports a success, then get_headers loops forever.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants