UDP hole punch and forwarded UDP ports not working on headless (LNL/LiteNetLib) #3309
Comments
How many combinations of connections have you tried with this? UDP Punchthrough isn't something that's guaranteed to work 100% of the time, because it heavily depends on the network route between the two ends. This means that not only the host connection matters, but the connecting user as well and potentially other network elements (e.g. if you're behind multiple levels of NAT). I've seen UDP Punchthrough work on an everyday basis, I've even tested it with some headless servers right now and it works, so I don't think this is something that's completely broken. There shouldn't really be any difference between headless and client anyways, because they're using the exact same code.
From experiments done last... well, moment: A UDP server behind certain types of NAT can result in the hole punch for that port timing out if someone stops sending traffic. This is something that's happening to me during tests w/ Rucio, and we both have the "good" NAT (port numbers map directly and it doesn't screw people around). This doesn't explain why an explicitly forwarded port would break, though. That'll need further investigation. What does LNL do to alleviate this situation?
That sounds like there's some firewall or something that might be blocking it. Neos will send regular beacon packets to keep the connection ready for punchthrough. Can you host a headless session with a custom session ID and let me know what it is? I can check what's happening on the server-side if I know what to look for. It might be just UDP not working with your particular combination of connections. Unfortunately that happens, networks are very messy. I'd probably check firewalls and make sure your ISP isn't filtering things either.
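To illustrate why regular beacon packets matter, here is a minimal sketch of a NAT mapping that expires after an idle period. It is a toy model only: `NatMapping`, `mapping_survives`, and the timeout values are hypothetical names and numbers chosen for illustration, and it does not reflect how Neos, LiteNetLib, or any particular router actually implements this.

```python
from dataclasses import dataclass


@dataclass
class NatMapping:
    """Toy model of a single UDP mapping in a NAT table (assumed behavior)."""
    idle_timeout_s: float      # assumed idle timeout; real devices differ
    last_seen_s: float = 0.0   # time of the most recent packet
    alive: bool = True

    def packet(self, now_s: float) -> bool:
        """Record a packet at now_s and report whether the mapping was still open."""
        if self.alive and now_s - self.last_seen_s > self.idle_timeout_s:
            # The gap since the last packet exceeded the timeout: the hole is gone.
            self.alive = False
        if self.alive:
            self.last_seen_s = now_s
        return self.alive


def mapping_survives(beacon_interval_s: float,
                     idle_timeout_s: float,
                     duration_s: float) -> bool:
    """Return True if beacons sent every beacon_interval_s keep the mapping open."""
    mapping = NatMapping(idle_timeout_s=idle_timeout_s)
    beacons = int(duration_s // beacon_interval_s)
    for i in range(beacons + 1):
        if not mapping.packet(i * beacon_interval_s):
            return False
    return True


if __name__ == "__main__":
    # Beacons every 10 s against an assumed 30 s idle timeout: mapping stays open.
    print(mapping_survives(10, 30, 300))
    # Beacons every 60 s against the same timeout: mapping expires in between.
    print(mapping_survives(60, 30, 300))
```

In this model the mapping only stays open while the gap between packets is no longer than the NAT's idle timeout, which is the situation described above where the hole punch times out once traffic stops.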
OK, that's good to know. A little lynx told me that there's a reply to this thread coming soon...
Thanks for the response Froox. I did more testing and determined that the static port NAT on the server was not configured appropriately (thanks to iamgreaser for running some UDP tests with me). It seems as though the UDP hole punch is only successful if either the host/server or both parties have static port NAT. After finally achieving LNL Relay-free connections, all of the connection problems I've been having vanished. The packet captures taken while the LNL Relay was in use showed a constant loss of packets at a regular cadence. I'm glad to have found a configuration that allows me to not have to use the relay, but I worry that there's something causing trouble at the relay. I think it's best to archive this issue, but I do think there are some details that would be helpful for others to know, including for customers/users operating behind a random port (Type-3) NAT, and that port forwards do not appear to be used by Neos. I'm going to keep testing; maybe there are things I can help add to the documentation about properly configuring enterprise routers and routers like pfSense and OPNsense, which by default will randomize the source port unless otherwise configured. Regarding the relay, have you heard any reports of users experiencing packet drops/connection issues when utilizing it? I'd be happy to flip my config back and do some more testing or grab some packets if you'd like. PS: The headless is RucioLess and it's running an open world named "LNL Testing". Session ID is S-5ac3d8aa-e4a8-487d-8cf4-cfb19534faae
Thanks for the info, I'm glad you got it to work! And yeah, unfortunately if even one of the sides randomizes the ports, the punchthrough isn't likely going to work. It requires that the port stays the same between the two requests. The relay is there specifically for cases like this, when due to network conditions/configuration the process just fails. We currently have only a single Relay in Central US, so it's likely that you might be tacking on a significant amount of latency, which is going to make this behave worse. We'll probably add more as time goes on, but it's an additional cost on our end. I'm going to close this for now, thanks! If you'd like to share this info with others, I'd recommend making an article on our Wiki!
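The requirement that "the port stays the same between the two requests" can be sketched with a toy model of two NAT behaviors. Everything here is an assumption made for illustration: `StaticPortNat`, `RandomizingNat`, `port_is_stable`, and `punch_likely` are hypothetical names, the "rendezvous" and "peer" destinations are placeholders, and the randomizing NAT uses a counter instead of real randomness so the example stays deterministic. It follows the stricter description above (both sides need a stable port) and ignores filtering rules, timeouts, and multiple NAT layers.

```python
class StaticPortNat:
    """Toy NAT that keeps the internal source port for every destination."""

    def external_port(self, internal_port: int, destination: str) -> int:
        return internal_port


class RandomizingNat:
    """Toy NAT that allocates a new external port for each new destination.

    Real routers typically pick the port at random; a counter is used here
    only so that the example is deterministic.
    """

    def __init__(self, first_port: int = 40000) -> None:
        self._next_port = first_port
        self._table: dict[tuple[int, str], int] = {}

    def external_port(self, internal_port: int, destination: str) -> int:
        key = (internal_port, destination)
        if key not in self._table:
            self._table[key] = self._next_port
            self._next_port += 1
        return self._table[key]


def port_is_stable(nat, internal_port: int = 5000) -> bool:
    """Compare the port a rendezvous server sees with the port used toward the peer."""
    seen_by_rendezvous = nat.external_port(internal_port, "rendezvous")
    used_toward_peer = nat.external_port(internal_port, "peer")
    return seen_by_rendezvous == used_toward_peer


def punch_likely(host_nat, client_nat) -> bool:
    """In this simplified model, both sides must keep a stable external port."""
    return port_is_stable(host_nat) and port_is_stable(client_nat)


if __name__ == "__main__":
    print(punch_likely(StaticPortNat(), StaticPortNat()))    # both ports stable
    print(punch_likely(StaticPortNat(), RandomizingNat()))   # client port changes
```

When the NAT allocates a different external port for the peer than the one the rendezvous server observed, the address handed to the other side is already stale, which is why the connection falls back to the relay in that case.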
As a follow-up, I expanded the existing article on the Neos Wiki to help users configure their routers appropriately for peer to peer communication - https://wiki.neos.com/Networking_Information Thanks!
Describe the bug?
Headless servers and clients are unable to establish a peer to peer connection when the headless is behind a NAT. The only peer to peer connection I have observed is when both the client and server are not behind a NAT.
According to the Neos networking documentation, a peer to peer connection should be possible by being on the same network or by means of establishing a UDP hole punch. However, clients do not establish a peer to peer connection to the headless server located behind a Type-2 (static port) NAT. Additionally, clients do not attempt to connect to the host port of the server over the public IP address. Due to this, no client is able to successfully establish a peer to peer connection to the headless, which results in all traffic going through a relay server.
By resolving this bug, Neos could save money by means of reduced bandwidth costs, and Neos users may experience better performance and fewer desync issues by connecting directly.
Note: This issue may apply to clients hosting worlds as well, but the only tests I've conducted thus far have been with a headless server as host.
Relevant issues
No relevant issues were found. However, with the recent changes to the network stack, it is possible that this issue is the root of some desync, disconnect and other issues that users have reported. I do not have any packet captures or evidence that this has ever worked in prior builds.
To Reproduce
You will observe that no attempts are made by the client to establish a connection to the public IP address of the server over the outbound port or the inbound forwarded ports.
Expected behavior
Expected behavior is that the UDP hole punch is successful OR the connection is established via the forwarded server ports.
Log Files
None of the headless logs appear to contain relevant data pertaining to LiteNetLib. If there is a debug command I can apply to the server, I'd be happy to test again and save any relevant logs.
Screenshots
The above screenshot shows the entire LiteNetLib connection process. Note that the IP addresses have been redacted for security purposes.
Key -
.6 = Private IP address of the server
[Blank] = Private IP address of the client
.116 = Public IP address of the client
.220 = LNL Relay
I'm unfamiliar with the full process that LiteNetLib uses to perform the UDP hole punch, but the packets do not show evidence of any attempt to utilize the static port NAT of the server, or of the client establishing a direct connection to the server via the forwarded port on its public IP.
How often does it happen?
Always
Does the bug persist after restarting Neos?
Yes
Neos Version Number
2021.10.30.605
What Platforms does this occur on?
Linux
Link to Reproduction Item/World
No response
Did this work before?
I Don't Know
If it worked before, on which build?
No response
Additional context
I'd be happy to test further or provide a copy of the packet captures to the devs upon request.
Reporters
Rucio#0134