feat(tests): Add core dump watch and analysis #94
base: dev
Mhm, it seems I made a mistake with the commits. I need to fix them.
Dammit, it dropped my suggestions commit when I force-pushed.
Fixed
I made only one minor suggestion. However, I couldn't test the feature. I tried to induce segfaults using division-by-zero or index-out-of-range scenarios in the Ganesha tests, but no core dumps were created in the expected folder, and nothing useful about the situation was logged.
Besides, I saw a permission denied error for the coredump_restore function. Below is the screenshot showing the error.
tests/setup_machine.sh
Outdated
@@ -154,6 +154,8 @@ apt_packages=(
libnfsidmap-dev
libnsl-dev
libsqlite3-dev
xfslibs-dev
To support all platforms, newly added packages should also be added for Fedora if they are available.
- The xfslibs-dev package has a different name in Fedora: xfsprogs-devel. Therefore, it must also be added to the list of dnf_packages.
- The inotify-tools package is common to both platforms. Therefore, it should be moved to common_packages.
What about the gdb package? Isn't it needed for accessing core dump info?
In all cases, the list of packages is sorted alphabetically, so this order should be respected.
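The layout suggested in this comment could look roughly like the sketch below. The array names come from the review; the contents are only the packages mentioned in this thread, not the full lists in tests/setup_machine.sh. Placing gdb in common_packages is an assumption, and the is_sorted helper is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch only: package names are the ones mentioned in this review,
# not the complete lists from tests/setup_machine.sh.

# Packages with the same name on Debian/Ubuntu and Fedora.
# (gdb being listed here is an assumption.)
common_packages=(
  gdb
  inotify-tools
)

# Debian/Ubuntu-specific names.
apt_packages=(
  libnfsidmap-dev
  libnsl-dev
  libsqlite3-dev
  xfslibs-dev
)

# Fedora-specific names.
dnf_packages=(
  xfsprogs-devel
)

# Hypothetical helper: succeeds if its arguments are already sorted.
is_sorted() {
  local sorted
  sorted=$(printf '%s\n' "$@" | LC_ALL=C sort)
  [ "$sorted" = "$(printf '%s\n' "$@")" ]
}

is_sorted "${apt_packages[@]}" && echo "apt_packages is sorted"
```

A check like is_sorted could be run in CI to keep the alphabetical order from drifting.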
GDB was probably pulled in as a dependency of other packages, but I've added an explicit reference to it.
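For context, a backtrace can usually be extracted from a core file non-interactively with GDB's batch mode. This is a minimal sketch and not necessarily how the PR invokes GDB; the function name analyze_core and the example paths are hypothetical.

```shell
# Sketch: print backtraces for all threads recorded in a core file.
# $1 = path to the executable, $2 = path to the core file.
analyze_core() {
  # --batch exits after running the given commands; -ex supplies one command.
  gdb --batch -ex "thread apply all bt" "$1" "$2"
}

# Example call (hypothetical paths):
#   analyze_core ./my_program /tmp/core.1234
```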
You need to allow saunafstests to modify
Another way to test this is by setting /proc/sys/kernel/core_pattern manually to $coredump_pattern (which is currently
I did this manually by adding the same rule you provided in the I set the core_pattern to
Interesting. I'll need to test it a little bit on an actual VM then. Could you post the entire log for a single test that should core dump?
@uristdwarf below I share the patch for the test I used to validate the feature and the log generated when running the test. Please let me know if you need anything else.
Mhm... I don't know whether shared objects behave the same as executables regarding core dumps; there's a trap log from the kernel which I've not seen before. In any case, shared objects should also be supported if possible. Could you quickly check whether using the same code in src/main/main.cc produces the expected result? Try any test involving master/chunkserver/metalogger (like
Also, are you sure you ran the test with
$ inotifywait -m "/tmp/" -e create -e moved_to
Setting up watches.
Watches established.
Otherwise, if it failed, it should at least print a message stating that it could not set up the feature. But I couldn't see any of these messages in the logs.
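As a rough illustration of how output from an inotifywait invocation like the one quoted here can drive a watch loop, consider the sketch below. The function names and the "core" file name prefix are assumptions, not the PR's actual code or pattern.

```shell
# Hypothetical filter: treat any file whose name starts with "core" as a dump.
is_core_file() {
  case "$1" in
    core*) return 0 ;;
    *) return 1 ;;
  esac
}

# Sketch: report core files that appear in directory $1.
watch_for_cores() {
  # -m keeps watching; -q hides the "Setting up watches" banner;
  # --format '%f' prints only the file name for each event.
  inotifywait -q -m "$1" -e create -e moved_to --format '%f' |
    while IFS= read -r name; do
      if is_core_file "$name"; then
        echo "core dump detected: $1/$name"
      fi
    done
}
```

Calling watch_for_cores /tmp blocks until it is interrupted, so a test harness would typically run it in the background.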
Actually no, I didn't remember to set COREDUMP_WATCH=1, sorry 😅
COREDUMP_WATCH=1; saunafs-tests --gtest_filter="GaneshaTests.test_nfs_ganesha_copy*"
Is there any other way to set this variable?
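One detail that may be relevant to the command in this comment: in a POSIX shell, an assignment followed by `;` creates a shell variable that is not exported, so a child process does not see it unless it was exported earlier. A minimal sketch, using sh -c as a stand-in for saunafs-tests:

```shell
unset COREDUMP_WATCH

# Assignment followed by ';': the variable stays local to this shell.
COREDUMP_WATCH=1; sh -c 'echo "semicolon: ${COREDUMP_WATCH:-unset}"'
# prints "semicolon: unset"

# Assignment as a command prefix: visible to that one command only.
unset COREDUMP_WATCH
COREDUMP_WATCH=1 sh -c 'echo "prefix: ${COREDUMP_WATCH:-unset}"'
# prints "prefix: 1"

# Exported: visible to every command started afterwards.
export COREDUMP_WATCH=1
sh -c 'echo "export: ${COREDUMP_WATCH:-unset}"'
# prints "export: 1"
```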
I need to figure out why this doesn't work, at least on Ubuntu desktop. Putting this back to draft.
LGTM
bbf0caa to 7793f8d
7793f8d to a3bcba6
This commit aims to improve test behavior and the output of test results. When a core dump occurs, the test is terminated immediately (regardless of whether SaunaFS caused it or not, the environment is unreliable for accurate test results). GDB is then used to analyze the core dump(s).

There are a few issues with this. First, it requires modifying `/proc/sys/kernel/core_pattern`, which affects the whole system. While I've tried to ensure that the original pattern is restored after failures and core dumps, I'm not completely confident it will be. Second, GDB might not be needed to print the backtrace: `LD_PRELOAD=/lib/libSegFault.so` may be a better alternative, but I didn't have time to test it.

Thus this feature is hidden behind a feature flag as experimental until these issues are solved.

Co-authored-by: aNeutrino <[email protected]>
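The save-and-restore concern described in this commit message is commonly handled with a shell `trap`. The following is a minimal sketch under stated assumptions, not the PR's implementation: the function names and the `CORE_PATTERN_FILE` variable are hypothetical, and writing the real file requires root.

```shell
# Sketch: temporarily override the core pattern and restore it on exit.
# CORE_PATTERN_FILE is a hypothetical override so the logic can be tried
# on a scratch file instead of the real kernel setting.
CORE_PATTERN_FILE="${CORE_PATTERN_FILE:-/proc/sys/kernel/core_pattern}"

core_pattern_set() {
  # Remember the current pattern, then write the new one.
  saved_core_pattern=$(cat "$CORE_PATTERN_FILE") || return 1
  printf '%s\n' "$1" > "$CORE_PATTERN_FILE"
}

core_pattern_restore() {
  # Do nothing if core_pattern_set never ran.
  [ -n "${saved_core_pattern+x}" ] || return 0
  printf '%s\n' "$saved_core_pattern" > "$CORE_PATTERN_FILE"
}

# Restore on normal exit; turn INT/TERM into an exit so the EXIT trap runs.
trap core_pattern_restore EXIT
trap 'exit 1' INT TERM
```

A trap cannot run if the shell receives SIGKILL or the machine loses power, which is one reason the restore cannot be fully guaranteed.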
a3bcba6 to 035e37a
LGTM