-
-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
socketpair() bugfix leaves ability for attacker to infinitely block the function under some limited circumstances #122797
Comments
We should not have any form of busy loop. Localhost processes with I suggest creating a Windows proof of concept. Ie: Does its TCP stack even hang in an |
If that is correct and As an additional note, I feel like maybe it might be a good idea to just use a non-blocking or timeout approach anyway just to be safe. It feels like a more defensive approach here could be useful, e.g. if this code ever is used for other operating systems where maybe the behavior is different. (Sorry, I had briefly duplicated this comment by accident. I apologize for the notification spam) |
For what it's worth, I personally think this statement isn't necessarily true for some reasons, e.g. related to Windows containers:
But I don't want to waste everyone's time going into that if the hang I pointed out isn't actually possible. |
I'm having trouble coming up with authoritative references on how the Windows TCP stack behaves, but in general IIUC the use of a OS-wide "SYN queue" or the use of SYN cookies should be common for partial "ESTABLISHING" state connections. The size of that queue (if one exists) is not related to a specific |
I'm out of my depth here (wouldn't even know what to look for in the Windows sources, though I can), but I don't really see how looping over a short wait is that much worse than a long wait? In most cases we'd expect a successful loopback connection to be near instantaneous, so even a short wait should return and we break out. But then, assuming you are under active attack, having a quick break out is potentially worse than a slow one, as you're far more likely to detect the attack. There does seem to be a (configurable) limit to the number of half open connections allowed, which to me says we should ignore the possibility ourselves and let the OS deal with it - a setting like this more clearly indicates the user's intent than anything we would do automatically, and if we work around their limit then we're just working against their intent. Best to play dumb and let things work the way they set it up. |
As for the impact: a long wait seems worse to me since a quick or instant failure should be an expected code path for the caller, unlike long wait. Since people don't seem to be fully sure about what's guaranteed here, I personally still think doing a short loop here would be preferable. But I have nothing more of use to add. |
To reduce clutter with non-issues, I'm closing this since the consensus seems to be that this isn't an issue (not that I personally know enough about whether this is true) and there hasn't been anything new coming up related to it. I hope that's helpful. |
Bug report
Bug description:
(This issue concerns #122133 78df104 )
I reported the initial issue of
socketpair()
and it was suggested to me to bring up here that the fix seems incomplete to me. It seems to me an attacker could still try to intercept the socketpair creation and cause a hang in this line:cpython/Lib/socket.py
Line 647 in 78df104
connect()
)I think this is a problem despite the normal expectation that most socket functions can almost always block or hang, because a user wouldn't expect
socketpair()
specifically to hang but rather either work or almost instantly fail since it's a local operation.There are probably some caveats why this is a minor issue:
On most systems there is a default timeout of 20 seconds or so, this should still apply here. But on systems where there isn't, this would possibly hang forever. Even if there is a timeout, 20 seconds will likely also too long for what most applications would usually expect this function to take.
It might require root or
CAP_NET_RAW
on most systems to exploit. Nevertheless, this could be the case for some network diagnostic tooling the user runs e.g. in flatpak that was meant to still have sandboxing and isolation that might be impacted by this.Unlike the original problem it would therefore require local access, since I assume the JavaScript access to localhost sockets from random websites usually wouldn't have
CAP_NET_RAW
.A possible fix would be to either set non-blocking on the server socket as well and do the
accept()
in a small short busy loop before failing almost instantly, or to set a short socket timeout (probably under 0.5 seconds) if that one is platform independent enough to do the job. My own code suggestion I sent some weeks back has a timeout longer than 1 second but that's probably too long given what the average application would be reasonably expecting.CPython versions tested on:
CPython main branch
Operating systems tested on:
Windows
The text was updated successfully, but these errors were encountered: