IPv6 defragmenting fails when segments do not overlap #54577

Closed
ssharks opened this issue Feb 7, 2023 · 27 comments
Labels: area: Networking, bug, priority: low

@ssharks (Collaborator) commented Feb 7, 2023

Describe the bug
When testing with the Maxwell Pro, some IPv6 fragmentation tests are failing. Most of the tests pass without problems, but the tests where fragments do not overlap seem to fail, while the tests where the fragments do overlap pass.

See:
https://github.com/zephyrproject-rtos/test_results/issues?q=is%3Aissue+is%3Aopen++Fragment+IPv6

The test suite uses the echo_server sample. The bug is reproduced by feeding fragmented packets to the echo server built with the config file:
https://github.com/hakehuang/zephyr/blob/tcp_ip_testing_maxwell/samples/net/sockets/echo_server/overlay-maxwell.conf

Expected behavior
IPv6 defragmentation should also work when fragments do not overlap.

Impact
At least some of the IPv6 fragmentation tests are failing.

ssharks added the bug label on Feb 7, 2023
@ssharks (Collaborator, Author) commented Feb 7, 2023

@hakehuang Thanks for testing

@nordicjm (Collaborator) commented Feb 8, 2023

@hakehuang would you be able to check the same functionality for IPv4 as well and see if there are similar issues? Support for IPv4 fragmentation was added late last year (it can be used/enabled in the current main). Never mind, I see there are IPv4 fragment issues too: https://github.com/zephyrproject-rtos/test_results/issues?q=is%3Aissue+is%3Aopen++Fragment+IPv4

@nordicjm (Collaborator) commented Feb 8, 2023

So the test results are from April 2022 - I don't think that accurately represents the current state of the networking stack, given that's almost a whole year ago. Can these tests be re-run? The IPv4 ones also don't really make sense given that I added IPv4 fragmented packet support to Zephyr in November 2022, so it would be fully expected that IPv4 fragmented packet tests would fail in April 2022, because there was no support back then.

@nordicjm (Collaborator) commented Feb 8, 2023

Comment from the IPv4 issue which also applies to IPv6 (replace IPV4 with IPV6 in the Kconfig names):

So I see the IPv4 fragmentation Kconfigs are set in that configuration, but NET_IPV4_FRAGMENT_MAX_PKT is not updated from the default of 2, so whilst 16 incoming fragmented packets can be handled, there can only be 2 fragments per packet. What is actually being used to test here: what is the size of the packets, what is the MTU of the transport, and what are the fragment sizes? From the linked tests, nearly all mention 4 fragments, which isn't going to work in the default configuration.

Kconfig help text:

Incoming fragments are stored in per-packet queue before being
reassembled. This value defines the number of fragments that
can be handled at the same time to reassemble a single packet.

We do not have to accept IPv6 packets larger than 1500 bytes
(RFC 2460 ch 5). This means that we should receive everything
within the first two fragments. The first one being 1280 bytes and
the second one 220 bytes.

You can increase this value if you expect packets with more
than two fragments.
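
For illustration, a minimal overlay sketch that raises the per-packet fragment limit above the default of 2. The values are illustrative guesses, not taken from the Maxwell overlay, and would need to be tuned to the actual packet and fragment sizes used by the tests:

# Illustrative values only; adjust to the number of fragments the tests generate.
CONFIG_NET_IPV4_FRAGMENT=y
CONFIG_NET_IPV4_FRAGMENT_MAX_PKT=4
CONFIG_NET_IPV6_FRAGMENT=y
CONFIG_NET_IPV6_FRAGMENT_MAX_PKT=4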

@rlubos (Contributor) commented Feb 8, 2023

Can these tests be re-run?

Yes, this was discussed during the net forum yesterday, and the IPv4 fragmentation test results are going to be updated.

carlescufi added the priority: low label on Feb 8, 2023
@hakehuang (Collaborator) commented Feb 9, 2023

Can these tests be re-run?

Yes, this was discussed during the net forum yesterday, and the IPv4 fragmentation test results are going to be updated.

According to the latest test run, those issues are still present; please check the weekly networking test report on the mailing list.

@rlubos (Contributor) commented Mar 17, 2023

I've spent some time trying to trigger the IPv6 defragmentation error... and failed.

Apart from basic tests (sending fragmented packets from a Linux host to the Zephyr device), I've also put together a test case where I could inject individual fragments into the stack in any order. No matter how I scrambled the fragments, the defragmentation implementation in Zephyr worked just fine. The only condition was that the fragment count could not exceed CONFIG_NET_IPV6_FRAGMENT_MAX_PKT, but that's expected, we don't have unlimited resources.

Now, I've requested some more info (ideally a Wireshark pcap) in zephyrproject-rtos/test_results#1124, which seemed like the most basic one. Let's see if we can get more information out of it. Otherwise, I really don't see a point in keeping the issue open if there's no way to reproduce it.

@hakehuang (Collaborator)

I've spent some time trying to trigger the IPv6 defragmentation error... and failed.

which platform are you using?

@rlubos (Contributor) commented Mar 20, 2023

I've spent some time trying to trigger the IPv6 defragmentation error... and failed.

which platform are you using?

qemu_x86 over SLIP when testing with a Linux host. The test case I've implemented was just an extension of https://github.com/zephyrproject-rtos/zephyr/tree/main/tests/net/ipv6_fragment, with no external communication involved.

@hakehuang (Collaborator) commented Mar 20, 2023

@rlubos, let me try to get the pcap for you to reference.
pcap.zip

@rlubos (Contributor) commented Mar 22, 2023

Thank you @hakehuang for the pcaps.

Now the primary question from my side - I know that you're using qemu_x86 for the tests, but are you using the e1000 driver or regular SLIP? I'm asking because using overlay-e1000.conf was the only way for me to reproduce some defragmentation failures. Not because of bugs in the net stack, however, but due to the driver/qemu_x86/whoever dropping individual fragments.

It caught my attention that the Frag_03-Frag_05 tests did not fail due to defragmentation errors, but rather because a TCP connection could not be established (see the retransmitted SYN,ACK packets). I then tried to reproduce the scenario from Frag_06, where I wrote a simple Linux app to send packets copied from Wireshark to the Zephyr sample over a raw Ethernet socket. For reference, the test case fails because there is no response to the fragmented TCP SYN packet.
When running this simple test I observed no issues defragmenting the SYN packet when executed on qemu_x86 (with SLIP) or native_posix. But on qemu_x86 with the e1000 driver, the defragmentation fails, simply because one or more fragments never even reach the e1000 driver (no IRQ call).

Now I'm not sure whether the problems with this driver are due to the driver itself or some underlying qemu_x86 issue; I've already experienced difficulties in the past when running throughput tests (#23302 (comment)). If that is the configuration you use for running the Maxwell tests, however, we should most likely consider some other alternative.
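
For reference, a hedged sketch of the kind of raw-socket replay tool described above. It is hypothetical: the interface name "zeth" and the frame bytes are placeholders, and the real tool would send the exact frames exported from the Wireshark capture:

/* Hypothetical replay sketch: send pre-captured Ethernet frames out of a
 * Linux AF_PACKET raw socket so the fragments reach the device under test
 * exactly as captured. Needs CAP_NET_RAW (run as root).
 */
#include <arpa/inet.h>
#include <linux/if_ether.h>
#include <linux/if_packet.h>
#include <net/if.h>
#include <stdio.h>
#include <sys/socket.h>
#include <unistd.h>

/* Placeholder: paste the frame bytes exported from Wireshark here. */
static const unsigned char frag1[] = { 0x00 };

int main(void)
{
	int fd = socket(AF_PACKET, SOCK_RAW, htons(ETH_P_ALL));

	if (fd < 0) {
		perror("socket");
		return 1;
	}

	struct sockaddr_ll dst = {
		.sll_family = AF_PACKET,
		.sll_protocol = htons(ETH_P_ALL),
		.sll_ifindex = if_nametoindex("zeth"), /* assumed interface name */
		.sll_halen = ETH_ALEN,
	};

	/* Each captured fragment would be sent the same way, in the desired order. */
	if (sendto(fd, frag1, sizeof(frag1), 0,
		   (struct sockaddr *)&dst, sizeof(dst)) < 0) {
		perror("sendto");
	}

	close(fd);
	return 0;
}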

Now to summarize the pcaps:

  • Frag_03-Frag05 - there likely are no issues with the defragmentation itself, but rather TCP connection establishment. Most likely due to some packet drops
  • Frag_06 - can't defragment TCP SYN, most likely due to packet drops. In "simulated" scenario, only reproducible with e1000
  • Frag_06_icmp/Frag_06_UDP - those are currently bound to fail, as the fragment count is enormous (128 fragments to reassemble). The current limit configured in echo_server is 8 fragments max. So we need to either consider increasing CONFIG_NET_IPV6_FRAGMENT_MAX_PKT/CONFIG_NET_PKT_RX_COUNT/CONFIG_NET_BUF_RX_COUNT (which should be fine given we have plenty of RAM in qemu_x86), as sketched below, or ignore the failure.
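
A hedged overlay sketch of what such an increase could look like; the numbers are rough, unvalidated guesses for a 128-fragment reassembly, assuming RAM on qemu_x86/native_posix is not a constraint:

# Illustrative values only, not validated against the Frag_06 tests.
CONFIG_NET_IPV6_FRAGMENT_MAX_PKT=128
# Enough RX packets/buffers to hold a full reassembly window plus normal traffic.
CONFIG_NET_PKT_RX_COUNT=160
CONFIG_NET_BUF_RX_COUNT=320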

@hakehuang (Collaborator)

@rlubos Thanks a lot for the detailed analysis. I would propose that we end up supporting a benchmark platform for the Zephyr stack tests. I know there are some issues in the qemu system, but other platforms would also have their own issues. Besides, for now, do you have a connection to the owner of the e1000 driver in qemu?

@rlubos (Contributor) commented Mar 22, 2023

Besides, for now, do you have a connection to the owner of the e1000 driver in qemu?

I'm not really sure who's responsible for the driver right now; it has no dedicated maintainer assigned. @jukkar @carlescufi Any ideas?

I would propose that we end up supporting a benchmark platform for the Zephyr stack tests. I know there are some issues in the qemu system, but other platforms would also have their own issues.

Perhaps we could try to switch to native_posix? Or use qemu_x86 but with SLIP TAP interface - it is a bit clumsy when it comes to throughputs, but personally, I don't recall having any issues with it.

@hakehuang (Collaborator)

Perhaps we could try to switch to native_posix? Or use qemu_x86 but with SLIP TAP interface - it is a bit clumsy when it comes to throughputs, but personally, I don't recall having any issues with it.

OK, let me try native_posix, or qemu_x86 with the SLIP TAP interface, and compare. Thanks for the suggestion.

@rlubos (Contributor) commented Mar 23, 2023

Perhaps we could try to switch to native_posix? Or use qemu_x86 but with SLIP TAP interface - it is a bit clumsy when it comes to throughputs, but personally, I don't recall having any issues with it.

OK, let me try native_posix, or qemu_x86 with the SLIP TAP interface, and compare. Thanks for the suggestion.

Yes, I'd give those two a try. It would be good to see, for instance, the results for IPv6 fragmentation only and compare them with the currently used platform. If that was the culprit, I think other areas could have been affected by the same problem as well.

@nordicjm (Collaborator)

Query: does native_posix do e.g. fragmentation through zephyr or through the linux host's networking stack (or both)?

@rlubos (Contributor) commented Mar 23, 2023

Query: does native_posix do e.g. fragmentation through zephyr or through the linux host's networking stack (or both)?

When I was testing with native_posix, the Zephyr stack did the defragmentation.

@hakehuang (Collaborator)

Query: does native_posix do e.g. fragmentation through zephyr or through the linux host's networking stack (or both)?

Per my understanding, we need to set up TAP mode on the host for native_posix, so the host stack is not impacted.

@hakehuang (Collaborator)

  • Frag_03-Frag05 - there likely are no issues with the defragmentation itself, but rather TCP connection establishment. Most likely due to some packet drops
  • Frag_06 - can't defragment TCP SYN, most likely due to packet drops. In "simulated" scenario, only reproducible with e1000
  • Frag_06_icmp/Frag_06_UDP - those are currently bound to fail, as the fragment count is enormous (128 fragments to reassemble). The current limit configured in echo_server is 8 fragments max. So we need to either consider increasing CONFIG_NET_IPV6_FRAGMENT_MAX_PKT/CONFIG_NET_PKT_RX_COUNT/CONFIG_NET_BUF_RX_COUNT (which should be fine given we have plenty of RAM in qemu_x86) or ignore the failure.

I changed to native_posix with overlay_max-stack.conf; Frag_03-05 still fail, and Frag_06 still fails.

@rlubos (Contributor) commented Mar 24, 2023

I changed to native_posix with overlay_max-stack.conf; Frag_03-05 still fail, and Frag_06 still fails.

Any chance for a pcap again? I'd like to compare with the previous results.

@hakehuang (Collaborator)

Any chance for a pcap again? I'd like to compare with the previous results.

Just to update: one new error message from native_posix shows that it cannot allocate a new TCP connection, although I have changed CONFIG_NET_SAMPLE_NUM_HANDLERS=20.

@rlubos (Contributor) commented Mar 24, 2023

Any chance for a pcap again? I'd like to compare with the previous results.

Just to update: one new error message from native_posix shows that it cannot allocate a new TCP connection, although I have changed CONFIG_NET_SAMPLE_NUM_HANDLERS=20.

Looks like CONFIG_NET_MAX_CONTEXTS might need an adjustment, and likely CONFIG_POSIX_MAX_FDS as well if the connection count increases. However, it's a bit weird that so many contexts are in use simultaneously; perhaps Maxwell does not close connections gracefully?
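
Something along these lines could be tried; the values are illustrative only, and CONFIG_NET_SAMPLE_NUM_HANDLERS is the echo_server sample option already mentioned above:

# Illustrative values; size them to the number of simultaneous Maxwell connections.
CONFIG_NET_MAX_CONTEXTS=20
CONFIG_POSIX_MAX_FDS=32
CONFIG_NET_SAMPLE_NUM_HANDLERS=20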

@hakehuang (Collaborator) commented Mar 24, 2023

However, it's a bit weird that so many contexts are in use simultaneously; perhaps Maxwell does not close connections gracefully?

Maybe this is the case, but let me try your suggestion first. By the way, what does the warning below mean?

net_ipv4: no slots available for 0x1d7
net_ipv4: no slots available for 0x1d8
net_ipv4: no slots available for 0x1d9
....

@rlubos the pcap FYI.
fragment_ipv4.zip

@rlubos (Contributor) commented Mar 29, 2023

Maybe this is the case, but let me try your suggestion first. By the way, what does the warning below mean?

The maximum number of fragments was exceeded - Zephyr can only store up to CONFIG_NET_IPV4_FRAGMENT_MAX_PKT fragments; if more arrive, it'll drop the reassembly.

@rlubos the pcap FYI.
fragment_ipv4.zip

I'm a bit confused right now; those are IPv4 reports (we have a separate issue to track IPv4, #54576). Are there no more IPv6 fragmentation failures?

@nordicjm (Collaborator) commented Mar 29, 2023

Had a quick look and I think there is an error in both IPv4 and IPv6 defragmentation (common code) here in fragments_are_ready:

	/* Fragments can arrive in any order, for example in reverse order:
	 *   1 -> Fragment3(M=0, offset=x2)
	 *   2 -> Fragment2(M=1, offset=x1)
	 *   3 -> Fragment1(M=1, offset=0)
	 * We have to test several requirements before proceeding with the reassembly:
	 * - We received the first fragment (Fragment Offset is 0)
	 * - All intermediate fragments are contiguous
	 * - The More bit of the last fragment is 0
	 */
	for (i = 0; i < CONFIG_NET_IPV4_FRAGMENT_MAX_PKT; i++) {
...
		more = net_pkt_ipv4_fragment_more(pkt);
	}

	if (more) {
		return 0;
	}

So the more value is set from the last received packet, but the last received packet doesn't have to be the final packet in the chain; in fact, if the packets are received as the comment suggests, more will wrongly be set to 1 when it is processed.

I'm not a great fan of how that function runs; it makes a lot of assumptions and only checks that packets received after the first have a higher offset, so if you receive the first packet, then the final (3rd) packet, then the second packet, that function is going to discard the one it receives last because the offset of that packet is lower than the new expected offset.

Essentially, out-of-order packets will not work.

I seemingly completely missed the shift_packets function call, which happens at a different point in the code. I will need to have a play with this manually and see whether it is working as expected or not.
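
For context, a minimal sketch (not the Zephyr implementation) of an arrival-order-independent readiness check. It assumes the fragment list is already kept sorted by offset at insertion time, which is roughly what the shift_packets call mentioned above appears to take care of:

/* Hedged sketch, not the Zephyr code: decide whether a fragment list that is
 * kept sorted by offset is ready for reassembly, regardless of arrival order.
 */
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct frag {
	uint16_t offset; /* fragment offset in bytes */
	uint16_t len;    /* fragment payload length */
	bool more;       /* More Fragments flag */
};

/* frags[] must already be sorted by ascending offset. */
static bool fragments_ready(const struct frag *frags, size_t count)
{
	uint32_t expected = 0;

	if (count == 0 || frags[0].offset != 0) {
		return false; /* first fragment (offset 0) is missing */
	}

	for (size_t i = 0; i < count; i++) {
		if (frags[i].offset != expected) {
			return false; /* hole or overlap in the sequence */
		}
		expected += frags[i].len;
	}

	/* Only the last fragment may clear the More bit, and it must. */
	return !frags[count - 1].more;
}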

@rlubos (Contributor) commented Mar 29, 2023

I seemingly completely missed the shift_packets

Yep, was just writing this when I saw your update.

@nordicjm For IPv6, I've specifically tested out-of-order reception, and it worked just fine, as long as I did not exceed the fragment limit or inject a fragment duplicate (which is considered an overlap by this implementation).

@carlescufi (Member)

Unable to reproduce, so closing this until we have solid evidence that an issue exists.
