-
Notifications
You must be signed in to change notification settings - Fork 602
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
CORE-8685 crash_tracker: crash signal handlers for crash recording #24959
CORE-8685 crash_tracker: crash signal handlers for crash recording #24959
Conversation
/ci-repeat 1 |
CI test resultstest results on build#61284
test results on build#61407
test results on build#61578
|
5710b4a
to
008974e
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks nice!
src/v/redpanda/application.cc
Outdated
void sigsegv_action() { | ||
crash_tracker::get_recorder().record_crash_sighandler(SIGSEGV); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Noticed that in seastar it ignores SIGSEGV
when SEASTAR_ASAN_ENABLED
is true: https://github.com/scylladb/seastar/blob/dfc804c506655f32650d44fa2f3dafe30414bcac/src/core/reactor.cc#L3998-L4004
Not sure if we need to handle that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I saw this. ASAN installs its own sighandler for sigsegv and prints a bunch of useful info (including the backtrace) when there is a segfault. I think seastar doesn't handle sigsegv when ASAN is enabled because they would overwrite the ASAN handler. And even if they wrapped the ASAN handler instead of overwriting it, their handler would just print the same/similar error message including the backtrace again, which is useless/annoying when the ASAN handler prints the same info.
For redpanda, I think we should capture crash records regardless of whether ASAN is enabled or not. When ASAN is enabled, redpanda's sighandler will wrap the ASAN signal handler instead of seastar's. Which is why I believe we don't have to special case for ASAN here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should capture crash records regardless of whether ASAN is enabled or not. When ASAN is enabled, redpanda's sighandler will wrap the ASAN signal handler instead of seastar's. Which is why I believe we don't have to special case for ASAN here.
Have you tested this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it is important to test that this works, ideally as a checked in test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Have you tested this? ... ideally as a checked in test.
Yes, the ducktape test I've added in this PR (crash_loop_checks_test.py::CrashLoopChecksTest.test_crash_report_with_signal{signo=Signals.SIGSEGV}
) tests this, specifically in the debug-mode run of ducktape where ASAN is enabled.
...
Recorded crash reason to crash file.
AddressSanitizer:DEADLYSIGNAL
=================================================================
==885==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000375 (pc 0x7f32d0a5eac0 bp 0x7ffc4fe0f750 sp 0x7ffc4fe0f730 T0)
==885==The signal is caused by a READ memory access.
==885==Hint: address points to the zero page.
#0 0x7f32d0a5eac0 in std::__1::vector<cluster::node_status, std::__1::allocator<cluster::node_status> >::capacity[abi:ne180100]() const /home/gellertnagy/code/redpanda/vbuild/llvm/3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff/install/bin/../include/c++/v1/vector:598:35
...
==885==ABORTING
src/v/redpanda/application.cc
Outdated
|
||
struct sigaction sa; | ||
sa.sa_sigaction = [](int sig, siginfo_t*, void*) { | ||
std::lock_guard<ss::util::spinlock> g(lock); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not reentrant. I've deadlocked seastar because of the spinlock they have where printing the stacktrace causes another segfault (this can happen if the stack is corrupted). Hung processes really suck.
Can we remove the need for the spinlock? Can we either make it thread local and check for the lock being held first? Or if not thread local then can we check if we are re-entering using a thread local?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not reentrant. Can we remove the need for the spinlock? Can we either make it thread local and check for the lock being held first? Or if not thread local then can we check if we are re-entering using a thread local?
Yes, we can replace the lock with a std::atomic<bool> handled
that we CAS on to decide whether to call Func()
or not. I didn't do it initially because I wanted to deviate as little as possible from the existing battle-tested seastar sighandler implementation. But I agree that there is a possibility of a deadlock in the seastar implementation and it is a simple change, so I've made the change now.
this can happen if the stack is corrupted
Do you have an example of how this can happen by any chance? I'm curious + maybe we could add a test for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you have an example of how this can happen by any chance
I think it's possible if there is a crash while unwinding the stack, I know this used to be an issue when initially developing the Wasm VM into seastar, but I forget the exact scenario now. I do think it had something to do with us not unblocking SIGILL however, or maybe it was aborts inside the Wasm VM? Sorry not recalling what's going on.
src/v/redpanda/application.cc
Outdated
void install_sighandlers() { | ||
install_oneshot_signal_handler<SIGSEGV, sigsegv_action>(); | ||
install_oneshot_signal_handler<SIGABRT, sigabrt_action>(); | ||
install_oneshot_signal_handler<SIGILL, sigill_action>(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Two things, outside of shard 0 SIGILL is blocked by default without Wasm on. WebAssembly removes the sigmask, but only if it's on.
redpanda/src/v/wasm/wasmtime.cc
Lines 1410 to 1418 in 2c68bb9
co_await ss::smp::invoke_on_all([] { | |
// wasmtime needs some signals for it's handling, make sure we | |
// unblock them. | |
auto mask = ss::make_empty_sigset_mask(); | |
sigaddset(&mask, SIGSEGV); | |
sigaddset(&mask, SIGILL); | |
sigaddset(&mask, SIGFPE); | |
ss::throw_pthread_error(::pthread_sigmask(SIG_UNBLOCK, &mask, nullptr)); | |
}); |
Have you tested this with Wasm enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
outside of shard 0 SIGILL is blocked by default without Wasm on
Good catch, thank you, I did not notice this. I've now added an ducktape test to test with crash signals on both shard 0 and shard 1. These would fail before, so to ensure that SIGILL and SIGABRT signals can be recorded even outside of shard 0, I've changed install_oneshot_signal_handler
to unblock the signal on all reactor threads similar to how ss::engine::handle_signal()
does it: https://github.com/redpanda-data/seastar/blob/ec806998fbf62386ee4b81ac27358b40335839ca/src/core/reactor.cc#L723-L743
WebAssembly removes the sigmask, but only if it's on. Have you tested this with Wasm enabled?
The wasm ducktape tests test this I believe + I've now done some manual testing and crash recording is working the same with/without wasm enabled (now that I correctly set up the signal masks). Do you have any pointers to why the wasm engine needs these signals unblocked? I'd like to better understand if there is any edge case we should consider here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
any pointers
It's a bit involved and in the weeds of the VM implementation, but the code is here: https://github.com/bytecodealliance/wasmtime/blob/main/crates/wasmtime/src/runtime/vm/sys/unix/signals.rs
They use it to handle things like out of bounds memory access and stack overflow inside the VM without needing explicit assembly to do the bounds checking.
src/v/redpanda/application.cc
Outdated
void sigsegv_action() { | ||
crash_tracker::get_recorder().record_crash_sighandler(SIGSEGV); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should capture crash records regardless of whether ASAN is enabled or not. When ASAN is enabled, redpanda's sighandler will wrap the ASAN signal handler instead of seastar's. Which is why I believe we don't have to special case for ASAN here.
Have you tested this?
008974e
to
f400776
Compare
force-push: address code review feedback
|
src/v/redpanda/application.cc
Outdated
struct sigaction sa; | ||
sa.sa_sigaction = [](int sig, siginfo_t*, void*) { | ||
// Reinstall the original signal handler | ||
auto r = ::sigaction(sig, &original_sa, nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were not going to crash the process right after this, this has issues overriding the Wasm signal handling (actually I wonder if there is a potential races of reinstalling this then another thread hitting raising a signal).
Yeah thinking more, I actually don't think we want to drop the lock, I think we just need to protect against re-entry. If this function is called concurrently we will probably do the wrong thing or not let the crash handling finish it's work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we were not going to crash the process right after this
We are going to crash the process right after this. The signal is re-raised for the current head as the last step of the signal handler, which ensures that we will not return to the main execution of the thread.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there is a potential races of reinstalling this then another thread hitting raising a signal. ... I actually don't think we want to drop the lock, I think we just need to protect against re-entry. If this function is called concurrently we will probably do the wrong thing or not let the crash handling finish it's work.
It comes down to how we want to handle the case of concurrent signals on multiple threads. We could:
- Spin waiting for the first signal handler to complete. This is what the lock-based implementation does if we install the signal handler after calling
Func()
. - Terminate the process, possibly before a crash report has been written. This is what the current implementation does with installing the signal handler first and then executing
Func()
.
We cannot return from the signal handlers without terminating, since the program cannot continue from where it left off. So we either have to terminate the process or spin when handling a second signal concurrently.
I believe both 1 or 2 are viable. The pros/cons I see are:
- 1 is what seastar does currently, and we haven't had problems with it.
- But with 1 there is technically a risk of deadlock. If two concurrent signals are raised on the same core, then the later signal could sping and starve the earlier one and cause a deadlock. Note that these signals must arrive on the same core but different threads (eg. a reactor thread and a non-reactor thread). They cannot arrive on the same thread because the
[sa.sa](http://sa.sa/)_mask
parameter of the signal handler blocks concurrent signals to the same. - 2 is what this PR implements today. It is async signal safe, but in this highly unlikely scenario of concurrent crashes on multiple cores it may not record a crash report.
I am personally happy with either approach because both are safe (2 is clearly safe while 1 is as safe as the status quo seastar signal handler).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a diff that both allows the crash handler to finish but also prevents deadlock: https://gist.github.com/rockwotj/46f9a6236b5d09221fa69d9964bd68e0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks yeah that's essentially what I meant by option (2) above. I agree that that seems the best approach because it allows for the crash recorder to complete before terminating the process even in the case of concurrent signals.
From your gist I didn't adopt the thread_local
changes, because, as we discussed on Slack, the sa_mask
parameter passed to the sigaction
call when we install the signal handler ensures that all signal delivery is blocked on the pthread that is currently handling a signal.
redpanda/src/v/crash_tracker/signals.cc
Line 57 in c2e59b7
sigfillset(&sa.sa_mask); |
src/v/redpanda/application.cc
Outdated
struct sigaction sa; | ||
sa.sa_sigaction = [](int sig, siginfo_t*, void*) { | ||
// Reinstall the original signal handler | ||
auto r = ::sigaction(sig, &original_sa, nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here is a diff that both allows the crash handler to finish but also prevents deadlock: https://gist.github.com/rockwotj/46f9a6236b5d09221fa69d9964bd68e0
This method is going to be used in signal handler functions to record information about the crash. Its implementation is async signal safe to ensure that it is safe to be called in a signal handler. Ref: https://man7.org/linux/man-pages/man7/signal-safety.7.html
This commit implements installing custom signal handlers for SIGSEGV and SIGABRT signals in redpanda using the `sigaction` system call. These signal handlers implement recording information about the crash for later debugging. Seastar has support for handling signals on the reactor. However, seastar's `ss::engine::handle_signal` is designed for asynchronously handling user signals (e.g., SIGUSR1, SIGINT, SIGHUP) within the reactor framework. This approach is unsuitable for crash signals where the reactor won't run again after the signal handler is done. In such cases, synchronous signal handlers are required to perform minimal, safe actions (e.g., writing out crash data) immediately upon signal reception, without relying on the reactor's asynchronous scheduling. This commit introduces synchronous, one-shot signal handlers for the main crash signals (SIGSEGV, SIGABRT, SIGILL). The implementation closely mirrors Seastar's existing signal handling mechanism for the above crash signals with the following difference: - Wrapping the existing signal handler: The original signal handler is retrieved and stored before installing redpanda's handler, ensuring it can be restored and triggered after our handler executes. This is to continue to call seastar's handlers as well after our handler. - Crash handler action: While seastar's signal handlers print to stderr, our signal handlers instead write a serde-serialized crash description to an open file descriptor. - SIGILL handler: for SIGILL seastar does not have a custom handler, which means that while for SIGSEGV and SIGABRT the original handler is seastar's, for SIGILL it is the default signal handler SIG_DFL which just terminates the process without printing to stderr. A SIGILL handler is introduced to capture `vassert()` crashes, which (among other places) uses the `__builtin_trap()` primitive leading to a SIGILL signal being raised. Seastar's current crash signal handlers (v25.1.x-pre): https://github.com/redpanda-data/seastar/blob/ec806998fbf62386ee4b81ac27358b40335839ca/src/core/reactor.cc#L4169-L4207
... for later reuse.
Allow sending signals to specific shards/threads of redpanda. This is to help with testing when a crash (eg. segfault) occurs on a shard of redpanda other than shard 0.
This adds a test to verify that crash reports captured while handling crash signals (segfaults, sigabrt, sigill) are correctly being reported. It also verifies that the generated crash reports can be read back later and are correctly printed when the crash loop limit is reached. This requires exposing `_tolerate_crashes` from `RedpandaService` to prevent test failures on crash-detecting checks, since for this test, the crashes are expected.
f400776
to
c2e59b7
Compare
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Thanks for working through this with me!
This PR implements installing custom signal handlers for SIGSEGV and SIGABRT signals in redpanda using the
sigaction
system call. These signal handlers implement recording information about the crash for later debugging.Seastar has support for handling signals on the reactor. However, seastar's
ss::engine::handle_signal
is designed for asynchronously handling user signals (e.g., SIGUSR1, SIGINT, SIGHUP) within the reactor framework. This approach is unsuitable for crash signals where the reactor won't run again after the signal handler is done. In such cases, synchronous signal handlers are required to perform minimal, safe actions (e.g., writing out crash data) immediately upon signal reception, without relying on the reactor's asynchronous scheduling.This commit introduces synchronous, one-shot signal handlers for the main crash signals (SIGSEGV, SIGABRT, SIGILL). The implementation closely mirrors Seastar's existing signal handling mechanism for the above crash signals with the following difference:
Wrapping the existing signal handler: The original signal handler is retrieved and stored before installing redpanda's handler, ensuring it can be restored and triggered after our handler executes. This is to continue to call seastar's handlers as well after our handler.
Crash handler action: While seastar's signal handlers print to stderr, our signal handlers instead write a serde-serialized crash description to an open file descriptor.
Note that for SIGILL seastar did not have a custom handler, which means that while for SIGSEGV and SIGABRT the original handler is seastar's, for SIGILL it is the default signal handler SIG_DFL which just terminated the process without printing to stderr. A SIGILL handler is introduced to capture
vassert()
crashes, which (among other places) uses the__builtin_trap()
primitive leading to a SIGILL signal being raised.Fixes https://redpandadata.atlassian.net/browse/CORE-8685
Backports Required
Release Notes