Using Varnish 6.0-lts causes notifications-push long-polling connections to be disconnected every 10 minutes. #119

Open
ManoelMilchev opened this issue May 17, 2023 · 0 comments

Comments

@ManoelMilchev
Contributor

Context:

Following the cyber security reports, we had to update the base images of our Docker images. We decided it was a good idea to also move from Varnish 6.2.1 to the 6.0-lts version.

The problem:
Most clusters showed no problems. The delivery clusters, however, serve long-polling connections for notifications-push, which should always stay open, and Varnish started disconnecting those clients exactly every 10 minutes.

After some research, we suspected that one of Varnish's default parameters, send_timeout, is causing this behaviour, as it probably works differently in the two versions.

Documentation of the send_timeout parameter, including its default value:

send_timeout
Units: seconds

Default: 600.000

Minimum: 0.000

Flags: delayed

Total timeout for ordinary HTTP1 responses. Does not apply to some internally generated errors and pipe mode.

When 'idle_send_timeout' is hit while sending an HTTP1 response, the timeout is extended unless the total time already taken for sending the response in its entirety exceeds this many seconds.

When this timeout is hit, the session is closed.
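
To make the quoted rule concrete, here is a toy model of it in Python. It is only a reading of the documentation text above, not Varnish's actual implementation, and the function name is made up for illustration.

```python
def on_idle_send_timeout(elapsed_s: float, send_timeout_s: float = 600.0) -> str:
    """Toy model of the documented rule (NOT Varnish's real code).

    Called whenever 'idle_send_timeout' fires while a response is being sent.
    elapsed_s is the total time already spent sending that response.
    """
    if elapsed_s > send_timeout_s:
        # Total time for the whole response exceeded send_timeout:
        # the session is closed.
        return "close"
    # Otherwise the timeout is extended and sending continues.
    return "extend"
```

Under this reading, a notifications-push response never completes, so its total send time eventually crosses send_timeout (600 seconds by default), which would match the disconnects we see if the reading is right.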

But this was only speculation, so we had to test it.

The test:

We deployed delivery-varnish with Varnish 6.0-lts on dev.
As expected, the connections again failed every 600 seconds (10 minutes), and we got the expected logs from PAM.
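
A check like this can be scripted rather than read off the logs by eye. The sketch below assumes the connect and disconnect times of each session have already been extracted as ISO 8601 strings; the input shape, the function names, and the 5-second tolerance are all hypothetical and not taken from the PAM logs.

```python
from datetime import datetime


def lifetimes_s(sessions):
    """Return the lifetime in seconds of each (connected_at, disconnected_at) pair."""
    return [
        (datetime.fromisoformat(end) - datetime.fromisoformat(start)).total_seconds()
        for start, end in sessions
    ]


def all_cut_at_timeout(sessions, timeout_s=600.0, tolerance_s=5.0):
    """True if there is at least one session and every session ended
    within tolerance_s of timeout_s."""
    lifetimes = lifetimes_s(sessions)
    return bool(lifetimes) and all(
        abs(lifetime - timeout_s) <= tolerance_s for lifetime in lifetimes
    )
```

If all_cut_at_timeout(sessions) holds with timeout_s=600.0, the disconnects line up with the default send_timeout rather than with client-side or network behaviour.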

I manually changed the parameter value to 7200 seconds (2 hours) in both Varnish pods. As we suspected, the connections were then closed after exactly two hours (see the logs from PAM).
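
For reference, one way to make such a runtime change is through varnishadm's param.set command. The sketch below is an assumption about how the change could be scripted, not a record of how it was actually done here: it presumes varnishadm is on the PATH inside the pod and can reach the management interface, and the helper names are made up.

```python
import subprocess


def param_set_command(name, value):
    """Build the varnishadm invocation that sets a runtime parameter."""
    return ["varnishadm", "param.set", name, str(value)]


def set_param(name, value, run=subprocess.run):
    """Apply the parameter on the local Varnish instance.

    'run' is injectable so the helper can be exercised without Varnish.
    Assumes varnishadm is installed and allowed to talk to varnishd.
    """
    return run(
        param_set_command(name, value),
        check=True,           # raise if varnishadm exits non-zero
        capture_output=True,
        text=True,
    )


# Example (would run inside each Varnish pod):
# set_param("send_timeout", 7200)
```

A value set this way does not survive a restart of varnishd. To make it permanent, the parameter would have to be passed at startup, for example as -p send_timeout=7200 on the varnishd command line.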

Future steps:
If an upgrade of the Varnish version becomes necessary in the near future, we can test this behaviour extensively over a longer period on dev and staging (for example, 24 hours). If everything goes well, we should be fine to proceed with this setup to prod.

Another option is to look for newer, suitable versions of the Varnish plugins, since the current ones restrict us to older Ubuntu base images.
