This innocent cute little bunny program brutally murders lo2s even with the new thread safety in place.
    constexpr int children = 4;
    constexpr int generations = 4;
    constexpr int threads = 6;

    #include <chrono>
    #include <thread>
    #include <vector>
    #include <stdio.h>
    #include <unistd.h>
    #include <stdlib.h>
    #include <sys/types.h>

    using myclock = std::chrono::high_resolution_clock;
    myclock::time_point begin;

    void work()
    {
        while (myclock::now() < begin + std::chrono::seconds(5));
        std::this_thread::sleep_until(begin + std::chrono::seconds(10));
        while (myclock::now() < begin + std::chrono::seconds(15));
    }

    int main() {
        int generation = 0;
        begin = myclock::now();
        for (int i = 0; i < children; i++)
        {
            pid_t pid = fork();
            if (pid == 0) // child
            {
                if (generation < generations)
                {
                    generation++;
                    i = -1;
                    continue; // more forking
                }
                break;
            }
            // parent
        }
        if (threads > 0)
        {
            std::vector<std::thread> tv;
            for (int t = 0; t < threads; t++)
            {
                tv.emplace_back(work);
            }
            for (auto& t : tv)
            {
                t.join();
            }
        }
        else {
            work();
        }
    }
Run with ulimit -n 524288
First hit (many instances of):
[14641106768213][pid: 64987][tid: 64987][ WARN]: Could not find system tree node for pid 69797
Second hit (also many instances):
[14641047009139][pid: 64987][tid: 64987][ERROR]: Failed to get process containing monitored thread 69740
Third hit (some of those later, repeatedly with the same pid):
[14647409082867][pid: 64987][tid: 64987][ WARN]: Thread 71270 is about to exit, but was never seen before.
And KO (not sure if all of those are related, there is a temporal gap):
[14647411589098][pid: 64987][tid: 64987][ERROR]: perf_event_open for sampling failed
[14647411596838][pid: 64987][tid: 64987][ERROR]: maybe the specified clock is unavailable?
[14647411630168][pid: 64987][tid: 64987][ERROR]: Failure while adding new thread cloned from 71260: No such process
[14650603407932][pid: 64987][tid: 64987][FATAL]: Aborting: No such process
This seems to be caused mostly by us missing the PTRACE_EVENT_SOMETHING for the different ways of creating offspring, presumably because of load (my computer reached a load of around 500 during testing).
[14641106768213][pid: 64987][tid: 64987][ WARN]: Could not find system tree node for pid 69797
We missed the PTRACE_EVENT_FORK for the parent, so there is no parent node in the system tree. This is harmless and is already recovered by using system_tree_root_node as the parent instead.
Failed to get process containing monitored thread 69740
We missed the PTRACE_EVENT_SOMETHING for the parent, so there is no entry for the parent in the tid_to_pid mapping table. This is currently not recovered at all, and it prevents the thread whose parent we missed from being monitored. Ideally we would just set its parent to NO_PARENT_PROCESS_PID or something similar. But this might be hairy, because I don't know how much information we need from the parent process.
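To make the idea concrete, here is a minimal sketch of a lookup that falls back to a sentinel instead of failing. It is not lo2s code: `process_of` is a hypothetical helper, the `std::map` is only assumed to resemble the tid_to_pid table, and `NO_PARENT_PROCESS_PID` (with the value -1) is the made-up name from above, not an existing constant.

```cpp
#include <map>
#include <sys/types.h>

// Hypothetical sentinel, named after the suggestion above; not an existing lo2s constant.
constexpr pid_t NO_PARENT_PROCESS_PID = -1;

// Hypothetical helper: returns the pid recorded for `tid`, or the sentinel
// when the ptrace event that would have registered the thread was missed.
pid_t process_of(const std::map<pid_t, pid_t>& tid_to_pid, pid_t tid)
{
    const auto it = tid_to_pid.find(tid);
    if (it == tid_to_pid.end())
    {
        // Unknown thread: report "no known parent" instead of raising an error.
        return NO_PARENT_PROCESS_PID;
    }
    return it->second;
}
```

Whether the sentinel is enough depends on how much information the rest of the code needs from the parent process, which is exactly the open question.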
Thread 71270 is about to exit, but was never seen before
Same deal: we just never got a PTRACE_EVENT_SOMETHING for this thread.
Aborting: No such process
perf_event_open returned ESRCH, which means that the thread we wanted to sample died before we could start sampling it. This is obviously not recovered at all currently, but it should be entirely possible to recover from.
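A rough sketch of what recovering could look like: treat ESRCH as "thread already gone" and skip it, and keep every other error as a real failure. Everything here is hypothetical. `AttachResult` and `try_attach` are illustrative names, and `open_sampling_fd` stands in for whatever wraps perf_event_open, assumed to return a file descriptor or -1 with errno set.

```cpp
#include <cerrno>
#include <functional>
#include <sys/types.h>

// Hypothetical outcome type; not part of lo2s.
enum class AttachResult
{
    Attached,   // sampling fd was opened
    ThreadGone, // thread exited before we could attach (ESRCH), safe to skip
    Failed      // any other error, still treated as a real failure
};

// Hypothetical helper: `open_sampling_fd` is an assumed wrapper around
// perf_event_open that returns an fd >= 0, or -1 with errno set.
AttachResult try_attach(pid_t tid, const std::function<int(pid_t)>& open_sampling_fd, int& fd_out)
{
    errno = 0;
    const int fd = open_sampling_fd(tid);
    if (fd >= 0)
    {
        fd_out = fd;
        return AttachResult::Attached;
    }
    // ESRCH means there is no such thread any more; do not abort for that.
    return errno == ESRCH ? AttachResult::ThreadGone : AttachResult::Failed;
}
```

A caller would log a warning and drop the thread on `ThreadGone`, and only abort on `Failed`.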