Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Latest ack_nr must not be bigger that the pending packets in the current receive window #138

Open
ogenev opened this issue Jan 30, 2025 · 13 comments · May be fixed by #147
Open

Latest ack_nr must not be bigger that the pending packets in the current receive window #138

ogenev opened this issue Jan 30, 2025 · 13 comments · May be fixed by #147

Comments

@ogenev
Copy link
Member

ogenev commented Jan 30, 2025

Running multiple uTP streams can cause sometimes the receiver to have pending packets with low sequence numbers than the current ack_nr.

This can't be possible because the latest ack_nr means that we received ALL packets before this number, so the pending packets should not include numbers <= last ack_nr (except if we overflow on the pending packets seq).

@ogenev ogenev changed the title Latest ack_nr must be bigger that the pending packets in the current receive window Latest ack_nr must not be bigger that the pending packets in the current receive window Jan 30, 2025
@KolbyML
Copy link
Member

KolbyML commented Jan 30, 2025

I don't think this is an error, but a feature. latest ack_nr means that we received ALL packets before this number This isn't true, Overflows are used to form sequentially order packets so just packet seq_num 0 can be newer then packet 2^16.

seq_numbers are infinitely reused during the connection, hence so the pending packets should not include numbers <= last ack_nr isn't true unless you are accounting for 0 being greater then 65535

@ogenev
Copy link
Member Author

ogenev commented Jan 30, 2025

yes, if you randomly generate a 2^16 packet number and then you overflow, then ok, it can happen, but the problem is that this bug is happening right now without overflowing seq numbers.

@ogenev
Copy link
Member Author

ogenev commented Jan 30, 2025

Edited the issue to note that this is happening even if we don't overflow on the seq numbers, which means there is a bug in the utp receiver logic and how the last ack_nr/pending packets are tracked.

@KolbyML
Copy link
Member

KolbyML commented Jan 30, 2025

Edited the issue to note that this is happening even if we don't overflow on the seq numbers, which means there is a bug in the utp receiver logic and how the last ack_nr/pending packets are tracked.

Could you post some logs, what client is the sender?

@ogenev
Copy link
Member Author

ogenev commented Jan 30, 2025

Here are some logs:

Image

Image

@KolbyML
Copy link
Member

KolbyML commented Jan 30, 2025

I do see the issue you are pointing out now. I don't agree with what is described in the issue though the receiver to have pending packets with low sequence numbers than the current ack_nr. as this isn't a problem, as you send send over 2^16 packets, as the seq_nums used keep rolling over. I do agree it is concerning that 2^16-8 packets are sent in flight at the same time though.

I wonder what uTP implementation is sending us that. The logs don't say anything, but the sender must be sending super small packets. Cause it is clear that there is a bug, but not necessarily enough to go on from the logs alone, but I assume you are debugging this now.

Very cool you found this problem!

@ogenev
Copy link
Member Author

ogenev commented Jan 30, 2025

This is the sender details:

Enr { id: Some("v4"), seq: 1738007246, NodeId: 0xba1647ae9a93d575a3c1d700774dc4d6e71fe2b85f2eb90b286d62b6602ed5cf, signature: "94c9deb7c244ee57347e49fb7b35b4c99068cda5fe4f8f0174c680508545d0ec1286f05baeb8614d49b56e777ff74e79db784a92dde830bad90fe0498b4d0ff0", IpV4 UDP Socket: Some(178.128.246.120:9009), IpV6 UDP Socket: None, IpV4 TCP Socket: None, IpV6 TCP Socket: None, Other Pairs: [("c", "89742036363730323365"), ("secp256k1", "a102c0655256cf8d4ad33c4187284da60b26e8f6e50d5431134977d0c042de752986")], .. }

@KolbyML
Copy link
Member

KolbyML commented Jan 30, 2025

This is the sender details:

Enr { id: Some("v4"), seq: 1738007246, NodeId: 0xba1647ae9a93d575a3c1d700774dc4d6e71fe2b85f2eb90b286d62b6602ed5cf, signature: "94c9deb7c244ee57347e49fb7b35b4c99068cda5fe4f8f0174c680508545d0ec1286f05baeb8614d49b56e777ff74e79db784a92dde830bad90fe0498b4d0ff0", IpV4 UDP Socket: Some(178.128.246.120:9009), IpV6 UDP Socket: None, IpV4 TCP Socket: None, IpV6 TCP Socket: None, Other Pairs: [("c", "89742036363730323365"), ("secp256k1", "a102c0655256cf8d4ad33c4187284da60b26e8f6e50d5431134977d0c042de752986")], .. }

Trin! Well that is very concerning, that means we are sending packets with payloads on average 1024*1024/65535= 16bytes. That is very yikers as the header is 64

@ogenev
Copy link
Member Author

ogenev commented Jan 30, 2025

This is the sender details:

Enr { id: Some("v4"), seq: 1738007246, NodeId: 0xba1647ae9a93d575a3c1d700774dc4d6e71fe2b85f2eb90b286d62b6602ed5cf, signature: "94c9deb7c244ee57347e49fb7b35b4c99068cda5fe4f8f0174c680508545d0ec1286f05baeb8614d49b56e777ff74e79db784a92dde830bad90fe0498b4d0ff0", IpV4 UDP Socket: Some(178.128.246.120:9009), IpV6 UDP Socket: None, IpV4 TCP Socket: None, IpV6 TCP Socket: None, Other Pairs: [("c", "89742036363730323365"), ("secp256k1", "a102c0655256cf8d4ad33c4187284da60b26e8f6e50d5431134977d0c042de752986")], .. }

Trin! Well that is very concerning, that means we are sending packets with payloads on average 1024*1024/65535= 16bytes. That is very yikers

I think it is more likely that there is a bug in the uTP socket and how it handles pending and ack packets.

@KolbyML
Copy link
Member

KolbyML commented Jan 30, 2025

This is the sender details:

Enr { id: Some("v4"), seq: 1738007246, NodeId: 0xba1647ae9a93d575a3c1d700774dc4d6e71fe2b85f2eb90b286d62b6602ed5cf, signature: "94c9deb7c244ee57347e49fb7b35b4c99068cda5fe4f8f0174c680508545d0ec1286f05baeb8614d49b56e777ff74e79db784a92dde830bad90fe0498b4d0ff0", IpV4 UDP Socket: Some(178.128.246.120:9009), IpV6 UDP Socket: None, IpV4 TCP Socket: None, IpV6 TCP Socket: None, Other Pairs: [("c", "89742036363730323365"), ("secp256k1", "a102c0655256cf8d4ad33c4187284da60b26e8f6e50d5431134977d0c042de752986")], .. }

Trin! Well that is very concerning, that means we are sending packets with payloads on average 1024*1024/65535= 16bytes. That is very yikers

I think it is more likely that there is a bug in the uTP socket and how it handles pending and ack packets.

I am aware of a bug making us not able to handle over 2^16 data packets, I am assuming this bug might be different, but hopefully we find and fix them all 🧐

@carver
Copy link
Contributor

carver commented Feb 12, 2025

I am aware of a bug making us not able to handle over 2^16 data packets, I am assuming this bug might be different, but hopefully we find and fix them all 🧐

Is this the error line you're thinking of for large data?

2025-02-12T06:38:27.948456Z  WARN uTP{send=107 recv=106}: utp_rs::conn: resetting connection: received ACK for unsent packet err=InvalidAckNum
2025-02-12T06:38:27.980575Z  WARN uTP{send=107 recv=106}: utp_rs::conn: ack does not correspond to known seq_num err=received ACK for unsent packet packet=packet cid=106 packetType=ST_STATE seqNr=59869 ackNr=12521 timestamp=810909440 timestampDiff=26374 remoteWindow=1018816

FWIW, #147 does not resolve this ^ error. (as of 678f7bf anyway)

@KolbyML
Copy link
Member

KolbyML commented Feb 12, 2025

I am aware of a bug making us not able to handle over 2^16 data packets, I am assuming this bug might be different, but hopefully we find and fix them all 🧐

Is this the error line you're thinking of for large data?

2025-02-12T06:38:27.948456Z  WARN uTP{send=107 recv=106}: utp_rs::conn: resetting connection: received ACK for unsent packet err=InvalidAckNum
2025-02-12T06:38:27.980575Z  WARN uTP{send=107 recv=106}: utp_rs::conn: ack does not correspond to known seq_num err=received ACK for unsent packet packet=packet cid=106 packetType=ST_STATE seqNr=59869 ackNr=12521 timestamp=810909440 timestampDiff=26374 remoteWindow=1018816

FWIW, #147 does not resolve this ^ error. (as of 678f7bf anyway)

I am a little confused what you are asking, but to fix the bug of not being able to send over 2^16 packets, instead of tracking the initial sequence number in sent.rs and recv.rs we would need to instead track the current_packet_window and next_sequence_number, also we would need to remove the concept in sent.rs where we never delete any of the packets, instead we would keep a circular buffer of packets in the current_packet_window or in other words.

packets should be a circular buffer containing (next_sequence_number.wrapping_sub(current_packet_window)..next_sequence_number) excluding acked packets.

I am a little confused what you were asking, but I thought I would share this context regardless.

@carver
Copy link
Contributor

carver commented Feb 12, 2025

Ok, so it sounds like it's a distinct issue. I just posted the test I used to get that error: #148

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants