Intermittent operation: MySQL server has gone away, HTTP 500 and HTTP 502 errors with API calls #1605
-
How Shlink is set-up
Current behavior
I have deployed Shlink, but the results are extremely unstable. Using the shlink-web-client docker image, I get random 500 and 502 errors on all pages, with random AJAX calls failing on each page load.
Expected behavior
No errors and smooth operation.
How to reproduce
Just a standard test deployment on docker, no redis.
-
Can you try with docker
Could you also check the logs (the container's output) and share some of the errors you mention? Could you also check what response reaches the browser when those 500 and 502 errors happen, and share some examples?
-
At first I did try with stable which had the same issues. I upgraded to latest and erased everything to see if that solved it but it didn't. This is the access log of my reverse proxy when accessing the overview page of shlink-web-client, which then calls the urls on the shlink server instance. This is downstream status code + URL path:
But, in the shlink logs there's nothing being reported for these 502 responses. Instead there's the rest health checks which go through perfectly ok:
The browser response for the 502 errors is the content
-
It is worth noting that when I load in the browser directly the URL path
-
I made some configuration changes, now I'm back to these 500 errors:
And on the shlink log side:
The next reload it worked, the reload after that it didn't, and so on.
-
The first set of 502 errors was definitely a misconfiguration on the reverse proxy. These new errors seem to be a misconfiguration on the database connection. How do you provide the database config env vars? (You can redact the credentials)
-
With the 502 errors it seems that it was caused by
I supply the DB config as
My suspicion, without having checked this at all, is that the workers' connections to the MySQL DB are idling too long, especially since this is a test deployment that gets no activity. So when I leave it for a while and then load something, the errors show up, and it's 500s until all workers have cycled and made new DB connections.
Interestingly, when I set up a simple bash loop to load one of my short URLs without any pause, and check pages around shlink-web-client while it runs, no 500 errors are reported. When I stop the script and wait some minutes, the issue shows up again in the web dashboard and the 500 errors are back.
Since it's intermittent, with something like a 50/50 or even 30/70 chance that it works, it would indicate that each reload hits a different worker. If the workers didn't all have a live connection, the errors would show up like this. I have not looked deeper into how possible this is, but the behaviour seems to back up the theory at least.
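For anyone wanting to repeat the bash-loop observation in a more measurable way, here is a minimal Python sketch that requests one URL repeatedly and tallies the status codes that come back. The helper names (`fetch_status`, `tally_statuses`) and the example URL are hypothetical and not part of Shlink. Running it once with no pause and once with a long pause between requests should show whether the 500s only appear after the connections have been idle.

```python
import time
import urllib.error
import urllib.request
from collections import Counter
from typing import Callable


def fetch_status(url: str, timeout: float = 10.0) -> int:
    """Return the HTTP status code of a GET request to `url`.

    Note: urllib follows redirects, so for a short URL this reports the
    status of the final target. Point it at an endpoint that does not
    redirect if you want the Shlink response itself.
    """
    request = urllib.request.Request(url, method="GET")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        # 4xx/5xx responses are raised as HTTPError; keep the status code.
        return error.code


def tally_statuses(
    url: str,
    attempts: int,
    pause_seconds: float = 0.0,
    fetch: Callable[[str], int] = fetch_status,
    sleep: Callable[[float], None] = time.sleep,
) -> Counter:
    """Request `url` `attempts` times and count the status codes seen."""
    counts: Counter = Counter()
    for attempt in range(attempts):
        counts[fetch(url)] += 1
        # Pause between requests, but not after the last one.
        if pause_seconds and attempt < attempts - 1:
            sleep(pause_seconds)
    return counts


if __name__ == "__main__":
    # Hypothetical URL: replace with an endpoint of your own instance.
    target = "https://shlink.example.test/abc123"
    print("no pause:   ", dict(tally_statuses(target, attempts=50)))
    print("10 min idle:", dict(tally_statuses(target, attempts=5, pause_seconds=600)))
```

If the idle-connection theory holds, the first run should report only successful responses, while the second should include some 500s. This does not prove the cause, it only makes the pattern described above easier to reproduce.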
-
I have the same intermittent 502 error...
-
For anyone ending up here, try using the roadrunner variation of the Shlink docker image, as it seems to not be affected by this issue.
It's a drop-in replacement. Just append -roadrunner to the version tag. For example: shlinkio/shlink:3.5.3-roadrunner
See #1669 (comment) for details.