-
Notifications
You must be signed in to change notification settings - Fork 143
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Hang and crash when disconnecting on Windows #314
Comments
Hey there, any news? Because first the commit that i replaced the t_socket was a attenpt to clear a warning also reverting one of them maybe in fact should be the best choice. |
Hi.
I'm using
libsmb2
on a Windows 11 laptop to connect to a Samba drive on a Linux embedded device. The connection happens through a USB3 cable, where the embedded device is detected as a network card on the laptop using RNDIS.It has been working well so far, with the exception that if I unplug the USB cable during a transfer, then I have a crash when I delete the context later with
smb2_destroy_context
.Now I tried to update the lib to the latest commit on
master
, as my prior version oflibsmb2
was from June 2022.Hang forever
But now with the latest new version, I get an infinite hang if I unplug the cable during a read operation using the sync API:
wait_for_reply
function.fd
is never invalidated to -1 and the function does not return.smb2_service_fd
, the polling raises flagsPOLLERR
andPOLLHUP
. ThePOLLHUP
block is never reached due to thegoto
inPOLLERR
block. However, in both cases I don't see anything that could invalidate thefd
there.Fix
EDIT: After checking the diff since June 2022, I see that
smb2_service
andsmb2_service_fd
now return at_socket
type instead of anint
. But on Windows, this type is defined as unsigned, so returning-1
and checking for< 0
does not work, causing the function to not return and thus the infinite hang.The faulty commit seems to be
4b1ef1f6bd09f96a9f0b78accfeac6e2c2766f34
.Crash
EDIT2: With the latest version and the hang fixed, if I unplug the USB cable while it's transferring a file, and then I close the SMB client and call
smb2_destroy_context
, then I get something looking like memory corruption, causing fully random crashes in the rest of the application. Commenting the linesmb2_destroy_context
in the destructor of my client fixes these crashes.Any idea what could go wrong in that case ? This seems very difficult to find and fix (on Windows, without valgrind); I'll try to replace the
calloc
andfree
functions and put prints, but the nature of the code with multiple callbacks and genericvoid*
types makes it extremely difficult to debug.Fix
EDIT3: After two full days of analysis, the commit
0dc1cc795512c808b52ecc83f7cd3e530936655d
which creates a special case for the close operation is the cause of my crash.In my workflow, I read a file and then close it using
smb2_close()
(sync). However, if the USB network interface is removed during the transfer, the read operation fails, but thesmb2_close()
is still called to simplify the logic, expecting it to fail as well. In thissmb2_close()
, thewait_for_reply()
call fails with code -1 due to an expected error insmb2_service()
, also returning -1. However in that case, the callback remains set somewhere in a PDU that was queued. So calling thefree()
at the end ofsmb2_close()
frees this memory that is still referenced in a PDU callback.Then during the call to
smb2_destroy_context
when destroying the client, the PDUs in the queues are cancelled and all callback are called. But as the one for the close operation was already freed, it crashes !What could be done to fix these issues? Reverting these 2 commits is maybe not the best fix, as one of them was added to fix a memory leak apparently.
In my case I'll fix locally already, because a leak is better than a crash.
The text was updated successfully, but these errors were encountered: