I've run into two nodes in my swarm ending up in a state where the containers running on them were labelled with status 'dead', which, based on my JupyterHub configuration, means they should have been removed. However, removal of these containers failed with a 'resource busy' error (similar to moby/moby#31195).
Under this condition, when the user attempts to reconnect, the spawner finds the old container and tries to relaunch it instead of launching a new container under a new name. Docker will not allow this because the container is flagged for removal, so the end user gets a 500 error and can't interact with the hub interface at all.
Would it be possible to add some checks for this, so that either the errors make more sense or, better, a new container gets launched under a different name (e.g. with an appended `_`) so the user can continue to work while the backend issue is resolved?
edit: This was resolved for the impacted users on the backend by SSHing into the swarm nodes and issuing `docker rm -f` for each dead container. After that, users could get new containers created again. If the remove step in dockerspawner did a forced removal, that might resolve the issue on its own.
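For reference, here's a minimal sketch of what that forced removal could look like, using docker-py's low-level `APIClient` (this is not DockerSpawner's actual code, and the container name is just an example):

```python
# Hedged sketch, not DockerSpawner's actual code: force-remove a "dead"
# container with docker-py's low-level APIClient, the equivalent of
# `docker rm -f`. The container name "jupyter-someuser" is illustrative.
import docker
from docker.errors import APIError, NotFound

client = docker.APIClient(base_url="unix://var/run/docker.sock")

def force_remove(container_name):
    """Remove a container even if it is dead or busy, like `docker rm -f`."""
    try:
        client.remove_container(container_name, v=True, force=True)
    except NotFound:
        pass  # already gone, nothing to do
    except APIError as e:
        # surface the Docker error instead of letting it bubble up as a bare 500
        raise RuntimeError(f"could not remove {container_name}: {e.explanation}")

force_remove("jupyter-someuser")
```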
Adding some extra error handling would indeed be great. If you have a traceback, we can make the changes in the right place. A PR would be very welcome!
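As a hedged illustration of the kind of extra error handling meant here (a sketch, not an actual patch): when starting an existing container fails because it is dead or flagged for removal, the spawner could log the underlying Docker error, force-remove the container, and signal that a fresh one should be created instead of surfacing a 500.

```python
# Hedged illustration of the extra error handling being discussed, not an
# actual DockerSpawner patch: if starting an existing container fails because
# Docker has it flagged for removal (or it is dead), force-remove it and tell
# the caller to spawn a brand-new container instead of returning a 500.
import docker
from docker.errors import APIError

client = docker.APIClient(base_url="unix://var/run/docker.sock")

def start_or_clean_up(container_name):
    """Try to start an existing container; on failure, clean it up.

    Returns the container name if it started, or None if the caller should
    create a fresh container.
    """
    try:
        client.start(container_name)
        return container_name
    except APIError as e:
        # A conflict/server error here typically means the container is dead
        # or already scheduled for removal; include the Docker message so the
        # traceback points at the real cause.
        print(f"Could not start {container_name}: {e.explanation}; removing it")
        client.remove_container(container_name, v=True, force=True)
        return None
```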