feat(tests): Add core dump watch and analysis
This commit aims to improve test behavior and the output of test
results. When a core dump occurs, the test is terminated immediately:
regardless of whether SaunaFS caused it, the environment is no longer
reliable enough for accurate test results. GDB is then used to analyze
the core dump(s).

There are a few issues with this. First, it requires modifying
`/proc/sys/kernel/core_pattern`, which affects the whole system.
While I've tried to ensure that the original pattern is restored after
failures and core dumps, I'm not completely confident it always will be.
Second, GDB might not be needed to print the backtrace:
`LD_PRELOAD=/lib/libSegFault.so` may be a better alternative, but I
didn't have time to test it.

Thus this feature is hidden behind a feature flag as experimental until
these issues are solved.
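
The save-and-restore concern described above can be illustrated with a minimal sketch. It is not the approach taken in this commit: a temporary file stands in for `/proc/sys/kernel/core_pattern` so that no root access is needed, and `run_with_temporary_setting` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch only: a temp file stands in for /proc/sys/kernel/core_pattern.
setting_file=$(mktemp)
echo "core" > "$setting_file"

# Hypothetical helper. The function body is a subshell, so the EXIT trap
# fires when the subshell ends, whether the wrapped command passed or failed.
run_with_temporary_setting() (
	original=$(cat "$setting_file")
	trap 'echo "$original" > "$setting_file"' EXIT
	echo "/tmp/temp-cores/core-%e-%p-%t" > "$setting_file"
	"$@"
	status=$?
	exit "$status"
)

run_with_temporary_setting false || echo "command failed, setting restored anyway"
restored=$(cat "$setting_file")
rm -f "$setting_file"
echo "$restored"
```

An EXIT trap does not run when the shell is killed with SIGKILL, so a process terminated by `kill -9` cannot restore anything itself; any restore has to happen before such a kill is issued.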
uristdwarf committed May 27, 2024
1 parent 64acea4 commit 21e7ce8
Showing 6 changed files with 101 additions and 2 deletions.
12 changes: 12 additions & 0 deletions tests/README
@@ -113,3 +113,15 @@ implemented in the following directory:
<SOURCE_DIRECTORY>/tests/tools

Merry testing!


Experimental features
=====================

These features can be enabled by setting the corresponding environment variable.

- COREDUMP_WATCH - Watch for core dumps and terminate the test if one is found.
Also print useful information from GDB. Note that this requires setting
`/proc/sys/kernel/core_pattern`. While it tries to restore the pattern after
each test has ended (including on failure), you may want to keep an eye on
this and restore the original pattern manually if needed (and please file a bug).
9 changes: 9 additions & 0 deletions tests/setup_machine.sh
@@ -154,6 +154,8 @@ apt_packages=(
libnfsidmap-dev
libnsl-dev
libsqlite3-dev
xfslibs-dev
inotify-tools
)
dnf_packages=(
boost-filesystem
@@ -371,6 +373,13 @@ if [ ! -f /etc/sudoers.d/saunafstest ] || ! grep -q '# Client' /etc/sudoers.d/sa
END
fi

if [ ! -f /etc/sudoers.d/saunafstest ] || ! grep -q '# Core dumps' /etc/sudoers.d/saunafstest >/dev/null; then
cat <<-'END' >>/etc/sudoers.d/saunafstest
# Core dumps
saunafstest ALL = NOPASSWD: /usr/bin/tee /proc/sys/kernel/core_pattern
END
fi

echo ; echo 'Fixing GIDs of users'
for name in saunafstest saunafstest_{0..9}; do
uid=$(getent passwd "${name}" | cut -d: -f3)
68 changes: 68 additions & 0 deletions tests/tools/gdb.sh
@@ -0,0 +1,68 @@
coredump_enabled_=
coredump_dir="/tmp/temp-cores/"
coredump_dir_after="/tmp/sfs-cores/"
coredump_pattern="${coredump_dir}core-%e-%p-%t"
coredump_original_pattern=$(cat /proc/sys/kernel/core_pattern)
coredump_pattern_restore=1


coredump_setup() {
	# Skip core dump handling when tests run under valgrind
	if valgrind_enabled; then
		return
	fi

	mkdir -p "$coredump_dir"
	chmod 777 "$coredump_dir"
	if [[ $coredump_original_pattern == "$coredump_pattern" ]]; then
		echo "Core pattern is already set, not modifying it"
		coredump_enabled_=1
		coredump_pattern_restore=0
		return
	fi

	# Return early only when tee fails; otherwise mark the watcher as enabled
	if ! echo "$coredump_pattern" | sudo tee /proc/sys/kernel/core_pattern; then
		echo "Could not set up coredump"
		return
	fi
	coredump_enabled_=1
}
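
A note on chaining `||` and `&&` in one list: bash gives the two operators equal precedence and evaluates them left to right, so `a || b && c` behaves like `(a || b) && c`. The sketch below uses two hypothetical functions, `chained` and `grouped`, to show the difference.

```shell
#!/usr/bin/env bash
# "true || echo ... && return 1" is parsed as "(true || echo ...) && return 1",
# so the return runs even though the first command succeeded.
chained() {
	true || echo "fallback" && return 1
	return 0
}

# Grouping the fallback ties "return" to the failure branch only.
grouped() {
	true || { echo "fallback"; return 1; }
	return 0
}

chained; echo "chained: $?"
grouped; echo "grouped: $?"
```

An explicit `if ! command; then ...; fi` expresses the same intent as the grouped form and is often easier to read.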

coredump_exists() {
	if [ -n "$(ls -A "$coredump_dir" 2> /dev/null)" ]; then
		return 0
	fi
	return 1
}

coredump_is_enabled() {
	# Succeeds only when coredump_setup has set the flag
	[[ -n $coredump_enabled_ ]]
}

coredump_watcher() {
	# Any new or moved-in file in the core directory ends the test
	inotifywait -m "$coredump_dir" -e create -e moved_to |
		while read -r path action file; do
			test_add_failure " --- CORE DUMP DETECTED, TERMINATING TEST ---"
			test_freeze_result
			coredump_analyze
			coredump_restore
			killall -9 -u "$(whoami)"
		done
}
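
In monitor mode, `inotifywait` by default prints one line per event in the layout `directory EVENTS filename`, which is what the `read` loop splits into three fields. The sketch below replays canned lines in that layout instead of running `inotifywait`, so it needs no inotify support; `handle_events` and the sample file names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads "directory EVENTS filename" lines and reports
# the first file whose name starts with "core".
handle_events() {
	local dir events file
	while read -r dir events file; do
		case $file in
			core*)
				echo "core dump detected: ${dir}${file}"
				return 0
				;;
		esac
	done
	return 1
}

# Canned input stands in for a live "inotifywait -m" pipe.
printf '%s\n' \
	'/tmp/temp-cores/ CREATE notes.txt' \
	'/tmp/temp-cores/ CREATE core-example-1234-1716800000' \
	| handle_events
```

Using `read -r` keeps backslashes in file names intact, and putting the file name last lets it absorb any spaces it contains.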

coredump_analyze() {
	mkdir -p "$coredump_dir_after"
	for core in "$coredump_dir"/core*; do
		echo " --- CORE DUMP BACKTRACE: ${core} --- "
		# Ask GDB which program produced the core, dropping its arguments
		executable=$(gdb -ex "core-file ${core}" -ex "info proc" -ex "quit" 2> /dev/null \
			| grep 'Core was generated by' \
			| sed "s/Core was generated by \`\\(.*\\)'\./\\1/" \
			| awk '{print $1}')

		gdb -batch -ex "core-file ${core}" -ex "thread apply all bt full" -ex "quit" ${executable} 2> /dev/null
		echo " --- CORE DUMP FINISHED FOR: ${core} --- "
		mv "$core" "$coredump_dir_after"
	done
}
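
The executable name is recovered from GDB's `Core was generated by` line. The sketch below isolates that extraction step and runs it against a canned sample line rather than real GDB output; `core_executable` and the sample path are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: keep the text between the backquote and the closing
# quote, then take the first word (the program without its arguments).
core_executable() {
	sed -n "s/^Core was generated by \`\(.*\)'\.\$/\1/p" | awk '{print $1; exit}'
}

# Made-up sample line in the layout GDB prints after loading a core file.
sample="Core was generated by \`/usr/local/bin/example-daemon -c /etc/example.cfg'."
printf '%s\n' "$sample" | core_executable
```

Taking the first whitespace-separated word assumes the program path contains no spaces; a path with spaces would be cut short.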

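coredump_restore() {
	# The flag holds 0 or 1, so compare explicitly: [[ 0 ]] is also true
	if [[ $coredump_pattern_restore == 1 ]]; then
		# Write through sudo tee, matching the sudoers rule used for setup
		echo "$coredump_original_pattern" | sudo tee /proc/sys/kernel/core_pattern > /dev/null || true
	fi
}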
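
When a shell variable holds 0 or 1 as a flag, note that `[[ $var ]]` only tests for a non-empty string, so a value of 0 still counts as true. The sketch below compares the two checks; the helper names are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helpers contrasting a non-empty test with an explicit compare.
is_nonempty() { [[ $1 ]]; }
is_enabled() { [[ $1 == 1 ]]; }

# Print "yes" or "no" depending on whether the given command succeeds.
yesno() {
	if "$@"; then echo yes; else echo no; fi
}

for value in "" 0 1; do
	echo "value='${value}' nonempty=$(yesno is_nonempty "$value") enabled=$(yesno is_enabled "$value")"
done
```

Arithmetic evaluation with `(( var ))` is another option for numeric flags, since it treats 0 as false.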
9 changes: 7 additions & 2 deletions tests/tools/saunafs.sh
@@ -38,8 +38,13 @@ setup_local_empty_saunafs() {
export ZONED_DISKS

# Try to enable core dumps if possible
-	if [[ $(ulimit -c) == 0 ]]; then
-		ulimit -c unlimited || ulimit -c 100000000 || ulimit -c 1000000 || ulimit -c 10000 || :
+	ulimit -c unlimited || ulimit -c 100000000 || ulimit -c 1000000 || ulimit -c 10000 || :
+	# Try to enable coredump analysis
+	if [[ ! -z ${COREDUMP_WATCH:-} ]]; then
+		coredump_setup
+		if coredump_is_enabled; then
+			( coredump_watcher & )
+		fi
 	fi

# Prepare directories for SaunaFS
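
The feature flag is read with `${COREDUMP_WATCH:-}`, which expands to an empty string when the variable is unset or empty. A minimal sketch of how that check behaves, assuming the script may run under `set -u`; `watch_requested` and `describe` are hypothetical names.

```shell
#!/usr/bin/env bash
set -u  # referencing an unset variable is now an error

# The ":-" default keeps the expansion safe even when the variable is unset.
watch_requested() {
	[[ -n ${COREDUMP_WATCH:-} ]]
}

describe() {
	if watch_requested; then echo "watch enabled"; else echo "watch disabled"; fi
}

unset COREDUMP_WATCH
describe
COREDUMP_WATCH=
describe
COREDUMP_WATCH=1
describe
```

With this kind of check any non-empty value enables the feature, including `0`, so the variable has to be unset or empty to keep the watcher off.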
4 changes: 4 additions & 0 deletions tests/tools/test.sh
@@ -46,6 +46,7 @@ test_end() {
if [[ ${DEBUG} ]]; then
set +x
fi
coredump_restore
test_freeze_result
# some tests may leave pwd at sfs mount point, causing a lockup when we stop sfs
cd
@@ -227,5 +228,8 @@ catch_error_() {
# print_stack 1 removes catch_error_ from stack trace
local stack=$(print_stack 1)
local command=$(get_source_line "$file" "$line")
if coredump_exists; then
sleep infinity
fi
test_add_failure "Command '$command' failed $location"$'\nBacktrace:\n'"$stack"
}
1 change: 1 addition & 0 deletions tests/tools/test_main.sh
@@ -31,3 +31,4 @@ done
. tools/color.sh
. tools/continuous_test.sh
. tools/logs.sh
. tools/gdb.sh
