segmentation fault on latest release on high speed traffic ... #41

Open
ulysse31 opened this issue Jul 26, 2024 · 3 comments

Comments

@ulysse31

Hello,

I'm using the SELKS project Docker install, which is based on the Docker image jasonish/suricata:master-amd64.
I have two nodes running the same install (SELKS).
Today I updated both instances (all Docker containers, including suricata), and for some unknown reason the suricata container on one of the two started crashing in a loop (after running for about a minute).
At first I thought it was a SELKS issue, potentially related to rule generation, but even after wiping all containers, images, volumes and data, the suricata container still crash-loops with a segmentation fault:

[Fri Jul 26 10:44:31 2024] W#06-bond1[78735]: segfault at 0 ip 00000000009349a9 sp 00007f853fffc270 error 4 in suricata[4d4000+637000] likely on CPU 22 (core 14, socket 0)
[Fri Jul 26 10:44:31 2024] Code: 74 24 50 48 85 f6 74 0b ba 01 00 00 00 ff 15 76 8c 44 00 48 89 df e8 06 06 ba ff 0f 0b 0f 1f 40 00 48 83 ec 18 48 85 d2 74 38 <0f> b6 06 89 c1 83 e1 1f 41 b8 01 00 00 00 83 f9 1f 75 5b 48 83 fa
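
For readers decoding the kernel line above, here is a minimal sketch, assuming the usual x86-64 Linux layout `segfault at ADDR ip IP sp SP error CODE in NAME[BASE+SIZE]` with all numbers in hexadecimal. `parse_segfault` and `decode_error` are hypothetical helper names, not part of Suricata or any kernel tool. The sketch reports the faulting address, the instruction pointer's offset inside the mapped region, and the meaning of the low page-fault error bits.

```python
import re

# Matches the x86 Linux "segfault at ..." kernel log line quoted above.
SEGFAULT_RE = re.compile(
    r"segfault at (?P<addr>[0-9a-f]+) ip (?P<ip>[0-9a-f]+) sp (?P<sp>[0-9a-f]+) "
    r"error (?P<err>[0-9a-f]+) in (?P<name>[^\[]+)\[(?P<base>[0-9a-f]+)\+(?P<size>[0-9a-f]+)\]"
)

def decode_error(code):
    """Decode the three low x86 page-fault error-code bits into words."""
    return {
        "cause": "protection violation" if code & 1 else "page not present",
        "access": "write" if code & 2 else "read",
        "mode": "user" if code & 4 else "kernel",
    }

def parse_segfault(line):
    """Extract the fields of one segfault line (hypothetical helper)."""
    m = SEGFAULT_RE.search(line)
    if m is None:
        raise ValueError("not a segfault line")
    ip = int(m["ip"], 16)
    base = int(m["base"], 16)
    return {
        "fault_addr": int(m["addr"], 16),
        "ip_offset": ip - base,  # offset of ip inside the mapped region
        "binary": m["name"],
        "error": decode_error(int(m["err"], 16)),
    }

# First dmesg line from this report, copied verbatim.
SAMPLE = (
    "[Fri Jul 26 10:44:31 2024] W#06-bond1[78735]: segfault at 0 "
    "ip 00000000009349a9 sp 00007f853fffc270 error 4 in "
    "suricata[4d4000+637000] likely on CPU 22 (core 14, socket 0)"
)

report = parse_segfault(SAMPLE)
print(hex(report["ip_offset"]), report["error"])
```

For the line above this gives a user-mode read of address 0 (a NULL pointer dereference) at offset 0x4609a9 into the mapped suricata region.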

Attached is the output of docker logs -f suricata:

suricata_docker_output.txt

The last lines are:

Perf: af-packet: bond1: rx ring: block_size=32768 block_nr=2 frame_size=1600 frame_nr=40 [AFPComputeRingParams:source-af-packet.c:1598]
Perf: af-packet: bond1: rx ring: block_size=32768 block_nr=2 frame_size=1600 frame_nr=40 [AFPComputeRingParams:source-af-packet.c:1598]
Perf: af-packet: bond1: rx ring: block_size=32768 block_nr=2 frame_size=1600 frame_nr=40 [AFPComputeRingParams:source-af-packet.c:1598]
Perf: af-packet: bond1: rx ring: block_size=32768 block_nr=2 frame_size=1600 frame_nr=40 [AFPComputeRingParams:source-af-packet.c:1598]
Notice: threads: Threads created -> W: 64 FM: 1 FR: 1 Engine started. [TmThreadWaitOnThreadRunning:tm-threads.c:1905]
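
The ring parameters in these log lines are internally consistent, which a quick calculation shows. This is a sketch of the usual AF_PACKET ring arithmetic (frames per block times number of blocks), not Suricata's own code. `ring_summary` is a hypothetical name, and the total memory figure assumes one ring per worker thread, which the log does not state explicitly.

```python
def ring_summary(block_size, block_nr, frame_size, workers):
    """Recompute AF_PACKET rx ring figures from the logged parameters."""
    frames_per_block = block_size // frame_size  # whole frames that fit in one block
    frame_nr = frames_per_block * block_nr       # frames in one ring
    ring_bytes = block_size * block_nr           # memory used by one ring
    return {
        "frames_per_block": frames_per_block,
        "frame_nr": frame_nr,
        "ring_bytes": ring_bytes,
        # Assumption: every worker thread owns exactly one ring.
        "total_bytes": ring_bytes * workers,
    }

# Values from the "rx ring" lines and "W: 64" in the thread summary.
summary = ring_summary(block_size=32768, block_nr=2, frame_size=1600, workers=64)
print(summary)
```

The computed frame_nr of 40 matches the logged value, so each ring holds 40 frames in 64 KiB.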

After that comes the segmentation fault in dmesg, and the container crashes and restarts in a loop ...

The only difference between the two servers is that one listens on a bonding interface (bond1) and the other listens directly on a physical one.
So from what I see, it is either something related to the recent update of the suricata image (11 hours ago) or potentially a hardware issue, but the latter seems unlikely because there are no error messages on the host or on the switch.

Is it possible that the latest version has issues with 10 Gbit bonded interfaces?
Do you have any additional debugging steps that would give more hints?
Thanks a lot.

@ulysse31
Author

UPDATE:

I thought it might be related to bonding, but it also segfaults on the other server, the one listening directly on a physical interface:

[Fri Jul 26 06:24:41 2024] W#09-eno2np1[3532764]: segfault at 0 ip 00000000009349a9 sp 00007f1f5fffc270 error 4 in suricata[4d4000+637000] likely on CPU 4 (core 1, socket 0)
[Fri Jul 26 06:24:41 2024] Code: 74 24 50 48 85 f6 74 0b ba 01 00 00 00 ff 15 76 8c 44 00 48 89 df e8 06 06 ba ff 0f 0b 0f 1f 40 00 48 83 ec 18 48 85 d2 74 38 <0f> b6 06 89 c1 83 e1 1f 41 b8 01 00 00 00 83 f9 1f 75 5b 48 83 fa
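
The `Code:` line can also be split mechanically. A minimal sketch, assuming the x86 kernel convention that the byte at the faulting instruction pointer is wrapped in `<...>`; `split_code_line` is a hypothetical helper.

```python
def split_code_line(line):
    """Return (bytes before the faulting ip, bytes from the faulting ip onward)."""
    tokens = line.split("Code:", 1)[1].split()
    # The token wrapped in <...> marks the byte at the instruction pointer.
    marker = next(i for i, tok in enumerate(tokens) if tok.startswith("<"))
    clean = [tok.strip("<>") for tok in tokens]
    before = bytes.fromhex("".join(clean[:marker]))
    at_ip = bytes.fromhex("".join(clean[marker:]))
    return before, at_ip

# Code line from this report, copied verbatim.
CODE_LINE = (
    "[Fri Jul 26 06:24:41 2024] Code: 74 24 50 48 85 f6 74 0b ba 01 00 00 00 "
    "ff 15 76 8c 44 00 48 89 df e8 06 06 ba ff 0f 0b 0f 1f 40 00 48 83 ec 18 "
    "48 85 d2 74 38 <0f> b6 06 89 c1 83 e1 1f 41 b8 01 00 00 00 83 f9 1f 75 "
    "5b 48 83 fa"
)

before, at_ip = split_code_line(CODE_LINE)
print(len(before), at_ip[:3].hex(" "))
```

Here the faulting instruction starts with the bytes 0f b6 06, which on x86-64 appears to encode a one-byte zero-extending load through rsi (movzbl (%rsi),%eax). That would be consistent with the reported read of address 0.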

This one is in the New York time zone (the other one is in the Paris time zone).
So it segfaults on both. The big difference is potentially the bandwidth, because of the traffic volume: one is a single 10 Gbps interface, the other is a bond of two 10 Gbps interfaces.
To reformulate: the latest version of the Docker suricata image seems to segfault under high traffic (an average of 20 MBytes/s on bond1).
The one in New York is currently at around 2-3 MBytes/s (low activity, early morning).

@ulysse31
Author

UPDATE2:

Confirmed after traffic picked up in New York:

[Fri Jul 26 08:10:34 2024] W#31-eno2np1[3671915]: segfault at 0 ip 00000000009349a9 sp 00007f31ad4f1270 error 4 in suricata[4d4000+637000] likely on CPU 6 (core 6, socket 0)
[Fri Jul 26 08:10:34 2024] Code: 74 24 50 48 85 f6 74 0b ba 01 00 00 00 ff 15 76 8c 44 00 48 89 df e8 06 06 ba ff 0f 0b 0f 1f 40 00 48 83 ec 18 48 85 d2 74 38 <0f> b6 06 89 c1 83 e1 1f 41 b8 01 00 00 00 83 f9 1f 75 5b 48 83 fa
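
The three dmesg reports in this thread can be compared programmatically to check whether they describe the same crash. A sketch with the segfault lines copied from above (timestamps and CPU hints trimmed); `REPORTS` and `crash_signature` are illustrative names.

```python
import re

REPORTS = [
    "W#06-bond1[78735]: segfault at 0 ip 00000000009349a9 sp 00007f853fffc270 error 4 in suricata[4d4000+637000]",
    "W#09-eno2np1[3532764]: segfault at 0 ip 00000000009349a9 sp 00007f1f5fffc270 error 4 in suricata[4d4000+637000]",
    "W#31-eno2np1[3671915]: segfault at 0 ip 00000000009349a9 sp 00007f31ad4f1270 error 4 in suricata[4d4000+637000]",
]

def crash_signature(line):
    """Return (fault address, ip, error code, mapping), ignoring thread, pid and sp."""
    m = re.search(r"segfault at (\S+) ip (\S+) sp \S+ error (\S+) in (\S+)", line)
    if m is None:
        raise ValueError("not a segfault line")
    return m.groups()

signatures = {crash_signature(line) for line in REPORTS}
threads = [line.split("[", 1)[0] for line in REPORTS]

print(len(signatures), threads)
```

All three lines reduce to a single signature even though the worker threads and interfaces differ, which suggests the same crash site on both hosts.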

It seems that the Docker suricata image no longer supports high traffic and crashes under it ...

ulysse31 changed the title from "segmentation fault on latest release of master-amd64" to "segmentation fault on latest release on high speed traffic ..." on Jul 26, 2024
@ulysse31
Author

ulysse31 commented Jul 26, 2024

UPDATE3:

Updated the title, since I can now confirm that the segmentation fault appears once traffic reaches a certain level on both of my test systems.
I've tried master-amd64, master-profiling and master; they all go into the same segmentation-fault crash loop under high traffic.
