Test test_arena_constraints sporadically hangs on Apple Silicon #756

Open
phprus opened this issue Jan 27, 2022 · 23 comments

phprus commented Jan 27, 2022

Commit: cd6a5f9

Compiler:

Apple clang version 13.0.0 (clang-1300.0.29.30)
Target: arm64-apple-darwin21.2.0
Thread model: posix

Debug build.

Test output:

[doctest] doctest version is "2.4.7"
[doctest] run with "--help" for options





^C===============================================================================
/Users/phprus/Devel/oneapi-src/tmp/oneTBB-cd6a5f9f4a5bae9fc157fa03093c17b9f861c9f2/test/tbb/test_arena_constraints.cpp:112:
TEST CASE:  Test memory leaks

/Users/phprus/Devel/oneapi-src/tmp/oneTBB-cd6a5f9f4a5bae9fc157fa03093c17b9f861c9f2/test/tbb/test_arena_constraints.cpp:112: FATAL ERROR: test case CRASHED: SIGINT - Terminal interrupt signal

===============================================================================
[doctest] test cases: 1 | 0 passed | 1 failed | 3 skipped
[doctest] assertions: 0 | 0 passed | 0 failed |
[doctest] Status: FAILURE!
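
For readers unfamiliar with the pattern: in the backtraces below, every thread is parked in `utils::SpinBarrier::wait`, polling an atomic and sleeping for a microsecond between checks. The following is a minimal sketch of that kind of barrier, with a timeout added so that a missing participant shows up as a failed wait rather than a hang. `TimedSpinBarrier` and `count_released` are hypothetical names used only for illustration. This is not the oneTBB test utility, and the idea that fewer threads reach the barrier than it expects is an assumption, not a confirmed diagnosis.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical stand-in for a spin barrier; NOT the oneTBB utils::SpinBarrier.
// Single use only: it has no epoch counter, so it cannot be reused.
class TimedSpinBarrier {
public:
    explicit TimedSpinBarrier(std::size_t expected) : expected_(expected) {
        assert(expected > 0);
    }

    // Returns true if all expected threads arrived before the deadline,
    // false if the wait timed out (the case that would otherwise hang).
    bool arrive_and_wait(std::chrono::milliseconds timeout) {
        arrived_.fetch_add(1, std::memory_order_acq_rel);
        const auto deadline = std::chrono::steady_clock::now() + timeout;
        while (arrived_.load(std::memory_order_acquire) < expected_) {
            if (std::chrono::steady_clock::now() >= deadline) return false;
            // Poll with a short sleep, as the stack frames in this report do.
            std::this_thread::sleep_for(std::chrono::microseconds(1));
        }
        return true;
    }

private:
    const std::size_t expected_;
    std::atomic<std::size_t> arrived_{0};
};

// Runs `participants` threads against a barrier sized for `expected` threads
// and returns how many of them saw the barrier open.
std::size_t count_released(std::size_t expected, std::size_t participants,
                           std::chrono::milliseconds timeout) {
    TimedSpinBarrier barrier(expected);
    std::atomic<std::size_t> released{0};
    std::vector<std::thread> threads;
    for (std::size_t i = 0; i < participants; ++i) {
        threads.emplace_back([&barrier, &released, timeout] {
            if (barrier.arrive_and_wait(timeout)) released.fetch_add(1);
        });
    }
    for (auto& t : threads) t.join();
    return released.load();
}
```

Calling `count_released(4, 4, ...)` releases all four threads, while `count_released(4, 3, ...)` releases none: with one participant missing, every waiter stays in the polling loop until the timeout, which without a timeout is an indefinite hang.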

lldb:

phprus@mbp debug % lldb
(lldb) process attach -p 28972
Process 28972 stopped
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
    frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
libsystem_kernel.dylib`__semwait_signal:
->  0x1a26d0ebc <+8>:  b.lo   0x1a26d0edc               ; <+40>
    0x1a26d0ec0 <+12>: pacibsp
    0x1a26d0ec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a26d0ec8 <+20>: mov    x29, sp
Target 0: (test_arena_constraints) stopped.

Executable module set to "/Users/phprus/Devel/oneapi-src/tmp/oneTBB-cd6a5f9f4a5bae9fc157fa03093c17b9f861c9f2/build/debug/appleclang_13.0_cxx17_64_debug/test_arena_constraints".
Architecture set to: arm64e-apple-macosx-.
(lldb) bt
* thread #1, queue = 'com.apple.main-thread', stop reason = signal SIGSTOP
  * frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x00000001a25dbd88 libsystem_c.dylib`nanosleep + 216
    frame #2: 0x00000001a2664820 libc++.1.dylib`std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) + 84
    frame #3: 0x0000000100a0ded8 test_arena_constraints`void std::__1::this_thread::sleep_for<long long, std::__1::ratio<1l, 1000000l> >(__d=0x000000016f411fa0) at thread:386:9
    frame #4: 0x0000000100a0ddd0 test_arena_constraints`void utils::SpinWaitWhile<void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'()>(pred=(anonymous class) @ 0x000000016f411fb0) at spin_barrier.h:49:13
    frame #5: 0x0000000100a0dd30 test_arena_constraints`void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(location=0x000000016f412730, comp=(anonymous class) @ 0x000000016f411fe8) at spin_barrier.h:60:5
    frame #6: 0x0000000100a0dcf4 test_arena_constraints`void utils::SpinWaitWhileEq<unsigned long, unsigned long>(location=0x000000016f412730, value=0) at spin_barrier.h:67:5
    frame #7: 0x0000000100a0dc10 test_arena_constraints`void utils::WaitWhileEq::operator(this=0x000000016f4120cf, location=0x000000016f412730, value=0)<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long) const at spin_barrier.h:84:9
    frame #8: 0x0000000100a0da90 test_arena_constraints`bool utils::SpinBarrier::customWait<utils::WaitWhileEq, utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onWaitCallback=0x000000016f4120cf, onOpenBarrierCallback=0x000000016f4120f7) at spin_barrier.h:140:13
    frame #9: 0x0000000100a0d9ac test_arena_constraints`bool utils::SpinBarrier::wait<utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onOpenBarrierCallback=0x000000016f4120f7) at spin_barrier.h:159:16
    frame #10: 0x0000000100a0d978 test_arena_constraints`utils::SpinBarrier::wait(this=0x000000016f412720) at spin_barrier.h:163:16
    frame #11: 0x0000000100a0d950 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x000000010115bd58, (null)=0x000000010115bd40)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)::operator()(tbb::detail::d1::blocked_range<unsigned long> const&) const at test_arena_constraints.cpp:133:33
    frame #12: 0x0000000100a0d388 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run_body(this=0x000000010115bd00, r=0x000000010115bd40) at parallel_for.h:119:9
    frame #13: 0x0000000100a0c644 test_arena_constraints`void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010115bd68, start=0x000000010115bd00, range=0x000000010115bd40, ed=0x00000001010d9688)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:447:19
    frame #14: 0x0000000100a0c1b4 test_arena_constraints`void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010115bd68, start=0x000000010115bd00, range=0x000000010115bd40, ed=0x00000001010d9688)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:288:16
    frame #15: 0x0000000100a0be40 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::execute(this=0x000000010115bd00, ed=0x00000001010d9688) at parallel_for.h:172:18
    frame #16: 0x0000000100ea9d64 libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::external_waiter>(tbb::detail::d1::task*, tbb::detail::r1::external_waiter&) + 884
    frame #17: 0x0000000100ea62fc libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::external_waiter>(tbb::detail::d1::task*, tbb::detail::r1::external_waiter&) + 80
    frame #18: 0x0000000100ea6038 libtbb_debug.12.6.dylib`tbb::detail::r1::task_dispatcher::execute_and_wait(tbb::detail::d1::task*, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&) + 180
    frame #19: 0x0000000100ea5f74 libtbb_debug.12.6.dylib`tbb::detail::r1::execute_and_wait(tbb::detail::d1::task&, tbb::detail::d1::task_group_context&, tbb::detail::d1::wait_context&, tbb::detail::d1::task_group_context&) + 64
    frame #20: 0x0000000100a0bb04 test_arena_constraints`tbb::detail::d1::execute_and_wait(t=0x000000010115bd00, t_ctx=0x000000016f412608, wait_ctx=0x000000016f412580, w_ctx=0x000000016f412608) at _task.h:191:5
    frame #21: 0x0000000100a0b7d0 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run(range=0x000000016f4126f8, body=0x000000016f4126f0, partitioner=0x000000016f4126bf, context=0x000000016f412608)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&) const&, tbb::detail::d1::auto_partitioner const&, tbb::detail::d1::task_group_context&) at parallel_for.h:114:13
    frame #22: 0x0000000100a0b684 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run(range=0x000000016f4126f8, body=0x000000016f4126f0, partitioner=0x000000016f4126bf)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&) const&, tbb::detail::d1::auto_partitioner const&) at parallel_for.h:103:9
    frame #23: 0x0000000100a0b49c test_arena_constraints`void tbb::detail::d1::parallel_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)>(range=0x000000016f4126f8, body=0x000000016f4126f0)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&) const&) at parallel_for.h:231:5
    frame #24: 0x0000000100a0b354 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x000000016f412b37)::$_6::operator()() const at test_arena_constraints.cpp:130:17
    frame #25: 0x0000000100a0b288 test_arena_constraints`tbb::detail::d1::task_arena_function<DOCTEST_ANON_FUNC_38()::$_6, void>::operator(this=0x000000016f412a50)() const at task_arena.h:68:9
    frame #26: 0x0000000100e79e74 libtbb_debug.12.6.dylib`tbb::detail::r1::task_arena_impl::execute(tbb::detail::d1::task_arena_base&, tbb::detail::d1::delegate_base&) + 1004
    frame #27: 0x0000000100e79a7c libtbb_debug.12.6.dylib`tbb::detail::r1::execute(tbb::detail::d1::task_arena_base&, tbb::detail::d1::delegate_base&) + 32
    frame #28: 0x0000000100a0a930 test_arena_constraints`void tbb::detail::d1::task_arena::execute_impl<void, DOCTEST_ANON_FUNC_38()::$_6>(this=0x000000016f412b38, f=0x000000016f412b37)::$_6&) at task_arena.h:255:9
    frame #29: 0x0000000100a0a704 test_arena_constraints`decltype(this=0x000000016f412b38, f=0x000000016f412b37)) tbb::detail::d1::task_arena::execute<DOCTEST_ANON_FUNC_38()::$_6>(DOCTEST_ANON_FUNC_38()::$_6&&) at task_arena.h:412:16
    frame #30: 0x00000001009fbb14 test_arena_constraints`DOCTEST_ANON_FUNC_38() at test_arena_constraints.cpp:127:19
    frame #31: 0x00000001009f8e24 test_arena_constraints`doctest::Context::run(this=0x000000016f413498) at doctest.h:6724:21
    frame #32: 0x00000001009fabac test_arena_constraints`main(argc=1, argv=0x000000016f413620) at doctest.h:6809:71
    frame #33: 0x0000000100cb90f4 dyld`start + 520
(lldb) thread select 2
* thread #2
    frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
libsystem_kernel.dylib`__semwait_signal:
->  0x1a26d0ebc <+8>:  b.lo   0x1a26d0edc               ; <+40>
    0x1a26d0ec0 <+12>: pacibsp
    0x1a26d0ec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a26d0ec8 <+20>: mov    x29, sp
(lldb) bt
* thread #2
  * frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x00000001a25dbd88 libsystem_c.dylib`nanosleep + 216
    frame #2: 0x00000001a2664820 libc++.1.dylib`std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) + 84
    frame #3: 0x0000000100a0ded8 test_arena_constraints`void std::__1::this_thread::sleep_for<long long, std::__1::ratio<1l, 1000000l> >(__d=0x000000016f81a9f0) at thread:386:9
    frame #4: 0x0000000100a0ddd0 test_arena_constraints`void utils::SpinWaitWhile<void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'()>(pred=(anonymous class) @ 0x000000016f81aa00) at spin_barrier.h:49:13
    frame #5: 0x0000000100a0dd30 test_arena_constraints`void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(location=0x000000016f412730, comp=(anonymous class) @ 0x000000016f81aa38) at spin_barrier.h:60:5
    frame #6: 0x0000000100a0dcf4 test_arena_constraints`void utils::SpinWaitWhileEq<unsigned long, unsigned long>(location=0x000000016f412730, value=0) at spin_barrier.h:67:5
    frame #7: 0x0000000100a0dc10 test_arena_constraints`void utils::WaitWhileEq::operator(this=0x000000016f81ab1f, location=0x000000016f412730, value=0)<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long) const at spin_barrier.h:84:9
    frame #8: 0x0000000100a0da90 test_arena_constraints`bool utils::SpinBarrier::customWait<utils::WaitWhileEq, utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onWaitCallback=0x000000016f81ab1f, onOpenBarrierCallback=0x000000016f81ab47) at spin_barrier.h:140:13
    frame #9: 0x0000000100a0d9ac test_arena_constraints`bool utils::SpinBarrier::wait<utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onOpenBarrierCallback=0x000000016f81ab47) at spin_barrier.h:159:16
    frame #10: 0x0000000100a0d978 test_arena_constraints`utils::SpinBarrier::wait(this=0x000000016f412720) at spin_barrier.h:163:16
    frame #11: 0x0000000100a0d950 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x000000010110bb58, (null)=0x000000010110bb40)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)::operator()(tbb::detail::d1::blocked_range<unsigned long> const&) const at test_arena_constraints.cpp:133:33
    frame #12: 0x0000000100a0d388 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run_body(this=0x000000010110bb00, r=0x000000010110bb40) at parallel_for.h:119:9
    frame #13: 0x0000000100a0c644 test_arena_constraints`void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010110bb68, start=0x000000010110bb00, range=0x000000010110bb40, ed=0x00000001010d9708)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:447:19
    frame #14: 0x0000000100a0c1b4 test_arena_constraints`void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010110bb68, start=0x000000010110bb00, range=0x000000010110bb40, ed=0x00000001010d9708)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:288:16
    frame #15: 0x0000000100a0be40 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::execute(this=0x000000010110bb00, ed=0x00000001010d9708) at parallel_for.h:172:18
    frame #16: 0x0000000100e8129c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 884
    frame #17: 0x0000000100e77a4c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 80
    frame #18: 0x0000000100e772e8 libtbb_debug.12.6.dylib`tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) + 484
    frame #19: 0x0000000100e965fc libtbb_debug.12.6.dylib`tbb::detail::r1::market::process(rml::job&) + 100
    frame #20: 0x0000000100e9f014 libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::run() + 120
    frame #21: 0x0000000100e9ef7c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::thread_routine(void*) + 44
    frame #22: 0x00000001a2709240 libsystem_pthread.dylib`_pthread_start + 148
(lldb) thread select 3
* thread #3
    frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
libsystem_kernel.dylib`__semwait_signal:
->  0x1a26d0ebc <+8>:  b.lo   0x1a26d0edc               ; <+40>
    0x1a26d0ec0 <+12>: pacibsp
    0x1a26d0ec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a26d0ec8 <+20>: mov    x29, sp
(lldb) bt
* thread #3
  * frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x00000001a25dbd88 libsystem_c.dylib`nanosleep + 216
    frame #2: 0x00000001a2664820 libc++.1.dylib`std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) + 84
    frame #3: 0x0000000100a0ded8 test_arena_constraints`void std::__1::this_thread::sleep_for<long long, std::__1::ratio<1l, 1000000l> >(__d=0x000000016fc269f0) at thread:386:9
    frame #4: 0x0000000100a0ddd0 test_arena_constraints`void utils::SpinWaitWhile<void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'()>(pred=(anonymous class) @ 0x000000016fc26a00) at spin_barrier.h:49:13
    frame #5: 0x0000000100a0dd30 test_arena_constraints`void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(location=0x000000016f412730, comp=(anonymous class) @ 0x000000016fc26a38) at spin_barrier.h:60:5
    frame #6: 0x0000000100a0dcf4 test_arena_constraints`void utils::SpinWaitWhileEq<unsigned long, unsigned long>(location=0x000000016f412730, value=0) at spin_barrier.h:67:5
    frame #7: 0x0000000100a0dc10 test_arena_constraints`void utils::WaitWhileEq::operator(this=0x000000016fc26b1f, location=0x000000016f412730, value=0)<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long) const at spin_barrier.h:84:9
    frame #8: 0x0000000100a0da90 test_arena_constraints`bool utils::SpinBarrier::customWait<utils::WaitWhileEq, utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onWaitCallback=0x000000016fc26b1f, onOpenBarrierCallback=0x000000016fc26b47) at spin_barrier.h:140:13
    frame #9: 0x0000000100a0d9ac test_arena_constraints`bool utils::SpinBarrier::wait<utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onOpenBarrierCallback=0x000000016fc26b47) at spin_barrier.h:159:16
    frame #10: 0x0000000100a0d978 test_arena_constraints`utils::SpinBarrier::wait(this=0x000000016f412720) at spin_barrier.h:163:16
    frame #11: 0x0000000100a0d950 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x0000000101123c58, (null)=0x0000000101123c40)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)::operator()(tbb::detail::d1::blocked_range<unsigned long> const&) const at test_arena_constraints.cpp:133:33
    frame #12: 0x0000000100a0d388 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run_body(this=0x0000000101123c00, r=0x0000000101123c40) at parallel_for.h:119:9
    frame #13: 0x0000000100a0c644 test_arena_constraints`void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x0000000101123c68, start=0x0000000101123c00, range=0x0000000101123c40, ed=0x00000001010d9788)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:447:19
    frame #14: 0x0000000100a0c1b4 test_arena_constraints`void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x0000000101123c68, start=0x0000000101123c00, range=0x0000000101123c40, ed=0x00000001010d9788)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:288:16
    frame #15: 0x0000000100a0be40 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::execute(this=0x0000000101123c00, ed=0x00000001010d9788) at parallel_for.h:172:18
    frame #16: 0x0000000100e8129c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 884
    frame #17: 0x0000000100e77a4c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 80
    frame #18: 0x0000000100e772e8 libtbb_debug.12.6.dylib`tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) + 484
    frame #19: 0x0000000100e965fc libtbb_debug.12.6.dylib`tbb::detail::r1::market::process(rml::job&) + 100
    frame #20: 0x0000000100e9f014 libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::run() + 120
    frame #21: 0x0000000100e9ef7c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::thread_routine(void*) + 44
    frame #22: 0x00000001a2709240 libsystem_pthread.dylib`_pthread_start + 148
(lldb) thread select 4
* thread #4
    frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
libsystem_kernel.dylib`__semwait_signal:
->  0x1a26d0ebc <+8>:  b.lo   0x1a26d0edc               ; <+40>
    0x1a26d0ec0 <+12>: pacibsp
    0x1a26d0ec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a26d0ec8 <+20>: mov    x29, sp
(lldb) bt
* thread #4
  * frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x00000001a25dbd88 libsystem_c.dylib`nanosleep + 216
    frame #2: 0x00000001a2664820 libc++.1.dylib`std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) + 84
    frame #3: 0x0000000100a0ded8 test_arena_constraints`void std::__1::this_thread::sleep_for<long long, std::__1::ratio<1l, 1000000l> >(__d=0x00000001700329f0) at thread:386:9
    frame #4: 0x0000000100a0ddd0 test_arena_constraints`void utils::SpinWaitWhile<void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'()>(pred=(anonymous class) @ 0x0000000170032a00) at spin_barrier.h:49:13
    frame #5: 0x0000000100a0dd30 test_arena_constraints`void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(location=0x000000016f412730, comp=(anonymous class) @ 0x0000000170032a38) at spin_barrier.h:60:5
    frame #6: 0x0000000100a0dcf4 test_arena_constraints`void utils::SpinWaitWhileEq<unsigned long, unsigned long>(location=0x000000016f412730, value=0) at spin_barrier.h:67:5
    frame #7: 0x0000000100a0dc10 test_arena_constraints`void utils::WaitWhileEq::operator(this=0x0000000170032b1f, location=0x000000016f412730, value=0)<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long) const at spin_barrier.h:84:9
    frame #8: 0x0000000100a0da90 test_arena_constraints`bool utils::SpinBarrier::customWait<utils::WaitWhileEq, utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onWaitCallback=0x0000000170032b1f, onOpenBarrierCallback=0x0000000170032b47) at spin_barrier.h:140:13
    frame #9: 0x0000000100a0d9ac test_arena_constraints`bool utils::SpinBarrier::wait<utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onOpenBarrierCallback=0x0000000170032b47) at spin_barrier.h:159:16
    frame #10: 0x0000000100a0d978 test_arena_constraints`utils::SpinBarrier::wait(this=0x000000016f412720) at spin_barrier.h:163:16
    frame #11: 0x0000000100a0d950 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x000000010115b758, (null)=0x000000010115b740)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)::operator()(tbb::detail::d1::blocked_range<unsigned long> const&) const at test_arena_constraints.cpp:133:33
    frame #12: 0x0000000100a0d388 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run_body(this=0x000000010115b700, r=0x000000010115b740) at parallel_for.h:119:9
    frame #13: 0x0000000100a0c644 test_arena_constraints`void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010115b768, start=0x000000010115b700, range=0x000000010115b740, ed=0x00000001010d9988)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:447:19
    frame #14: 0x0000000100a0c1b4 test_arena_constraints`void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010115b768, start=0x000000010115b700, range=0x000000010115b740, ed=0x00000001010d9988)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:288:16
    frame #15: 0x0000000100a0be40 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::execute(this=0x000000010115b700, ed=0x00000001010d9988) at parallel_for.h:172:18
    frame #16: 0x0000000100e8129c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 884
    frame #17: 0x0000000100e77a4c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 80
    frame #18: 0x0000000100e772e8 libtbb_debug.12.6.dylib`tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) + 484
    frame #19: 0x0000000100e965fc libtbb_debug.12.6.dylib`tbb::detail::r1::market::process(rml::job&) + 100
    frame #20: 0x0000000100e9f014 libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::run() + 120
    frame #21: 0x0000000100e9ef7c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::thread_routine(void*) + 44
    frame #22: 0x00000001a2709240 libsystem_pthread.dylib`_pthread_start + 148
(lldb) thread select 5
* thread #5
    frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
libsystem_kernel.dylib`__semwait_signal:
->  0x1a26d0ebc <+8>:  b.lo   0x1a26d0edc               ; <+40>
    0x1a26d0ec0 <+12>: pacibsp
    0x1a26d0ec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a26d0ec8 <+20>: mov    x29, sp
(lldb) bt
* thread #5
  * frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x00000001a25dbd88 libsystem_c.dylib`nanosleep + 216
    frame #2: 0x00000001a2664820 libc++.1.dylib`std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) + 84
    frame #3: 0x0000000100a0ded8 test_arena_constraints`void std::__1::this_thread::sleep_for<long long, std::__1::ratio<1l, 1000000l> >(__d=0x000000017043e9f0) at thread:386:9
    frame #4: 0x0000000100a0ddd0 test_arena_constraints`void utils::SpinWaitWhile<void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'()>(pred=(anonymous class) @ 0x000000017043ea00) at spin_barrier.h:49:13
    frame #5: 0x0000000100a0dd30 test_arena_constraints`void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(location=0x000000016f412730, comp=(anonymous class) @ 0x000000017043ea38) at spin_barrier.h:60:5
    frame #6: 0x0000000100a0dcf4 test_arena_constraints`void utils::SpinWaitWhileEq<unsigned long, unsigned long>(location=0x000000016f412730, value=0) at spin_barrier.h:67:5
    frame #7: 0x0000000100a0dc10 test_arena_constraints`void utils::WaitWhileEq::operator(this=0x000000017043eb1f, location=0x000000016f412730, value=0)<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long) const at spin_barrier.h:84:9
    frame #8: 0x0000000100a0da90 test_arena_constraints`bool utils::SpinBarrier::customWait<utils::WaitWhileEq, utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onWaitCallback=0x000000017043eb1f, onOpenBarrierCallback=0x000000017043eb47) at spin_barrier.h:140:13
    frame #9: 0x0000000100a0d9ac test_arena_constraints`bool utils::SpinBarrier::wait<utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onOpenBarrierCallback=0x000000017043eb47) at spin_barrier.h:159:16
    frame #10: 0x0000000100a0d978 test_arena_constraints`utils::SpinBarrier::wait(this=0x000000016f412720) at spin_barrier.h:163:16
    frame #11: 0x0000000100a0d950 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x000000010110bd58, (null)=0x000000010110bd40)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)::operator()(tbb::detail::d1::blocked_range<unsigned long> const&) const at test_arena_constraints.cpp:133:33
    frame #12: 0x0000000100a0d388 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run_body(this=0x000000010110bd00, r=0x000000010110bd40) at parallel_for.h:119:9
    frame #13: 0x0000000100a0c644 test_arena_constraints`void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010110bd68, start=0x000000010110bd00, range=0x000000010110bd40, ed=0x00000001010d9888)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:447:19
    frame #14: 0x0000000100a0c1b4 test_arena_constraints`void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010110bd68, start=0x000000010110bd00, range=0x000000010110bd40, ed=0x00000001010d9888)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:288:16
    frame #15: 0x0000000100a0be40 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::execute(this=0x000000010110bd00, ed=0x00000001010d9888) at parallel_for.h:172:18
    frame #16: 0x0000000100e8129c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 884
    frame #17: 0x0000000100e77a4c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 80
    frame #18: 0x0000000100e772e8 libtbb_debug.12.6.dylib`tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) + 484
    frame #19: 0x0000000100e965fc libtbb_debug.12.6.dylib`tbb::detail::r1::market::process(rml::job&) + 100
    frame #20: 0x0000000100e9f014 libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::run() + 120
    frame #21: 0x0000000100e9ef7c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::thread_routine(void*) + 44
    frame #22: 0x00000001a2709240 libsystem_pthread.dylib`_pthread_start + 148
(lldb) thread select 6
* thread #6
    frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
libsystem_kernel.dylib`__semwait_signal:
->  0x1a26d0ebc <+8>:  b.lo   0x1a26d0edc               ; <+40>
    0x1a26d0ec0 <+12>: pacibsp
    0x1a26d0ec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a26d0ec8 <+20>: mov    x29, sp
(lldb) bt
* thread #6
  * frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x00000001a25dbd88 libsystem_c.dylib`nanosleep + 216
    frame #2: 0x00000001a2664820 libc++.1.dylib`std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) + 84
    frame #3: 0x0000000100a0ded8 test_arena_constraints`void std::__1::this_thread::sleep_for<long long, std::__1::ratio<1l, 1000000l> >(__d=0x000000017084a9f0) at thread:386:9
    frame #4: 0x0000000100a0ddd0 test_arena_constraints`void utils::SpinWaitWhile<void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'()>(pred=(anonymous class) @ 0x000000017084aa00) at spin_barrier.h:49:13
    frame #5: 0x0000000100a0dd30 test_arena_constraints`void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(location=0x000000016f412730, comp=(anonymous class) @ 0x000000017084aa38) at spin_barrier.h:60:5
    frame #6: 0x0000000100a0dcf4 test_arena_constraints`void utils::SpinWaitWhileEq<unsigned long, unsigned long>(location=0x000000016f412730, value=0) at spin_barrier.h:67:5
    frame #7: 0x0000000100a0dc10 test_arena_constraints`void utils::WaitWhileEq::operator(this=0x000000017084ab1f, location=0x000000016f412730, value=0)<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long) const at spin_barrier.h:84:9
    frame #8: 0x0000000100a0da90 test_arena_constraints`bool utils::SpinBarrier::customWait<utils::WaitWhileEq, utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onWaitCallback=0x000000017084ab1f, onOpenBarrierCallback=0x000000017084ab47) at spin_barrier.h:140:13
    frame #9: 0x0000000100a0d9ac test_arena_constraints`bool utils::SpinBarrier::wait<utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onOpenBarrierCallback=0x000000017084ab47) at spin_barrier.h:159:16
    frame #10: 0x0000000100a0d978 test_arena_constraints`utils::SpinBarrier::wait(this=0x000000016f412720) at spin_barrier.h:163:16
    frame #11: 0x0000000100a0d950 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x0000000101103a58, (null)=0x0000000101103a40)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)::operator()(tbb::detail::d1::blocked_range<unsigned long> const&) const at test_arena_constraints.cpp:133:33
    frame #12: 0x0000000100a0d388 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run_body(this=0x0000000101103a00, r=0x0000000101103a40) at parallel_for.h:119:9
    frame #13: 0x0000000100a0c644 test_arena_constraints`void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x0000000101103a68, start=0x0000000101103a00, range=0x0000000101103a40, ed=0x00000001010d9808)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:447:19
    frame #14: 0x0000000100a0c1b4 test_arena_constraints`void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x0000000101103a68, start=0x0000000101103a00, range=0x0000000101103a40, ed=0x00000001010d9808)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:288:16
    frame #15: 0x0000000100a0be40 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::execute(this=0x0000000101103a00, ed=0x00000001010d9808) at parallel_for.h:172:18
    frame #16: 0x0000000100e8129c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 884
    frame #17: 0x0000000100e77a4c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 80
    frame #18: 0x0000000100e772e8 libtbb_debug.12.6.dylib`tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) + 484
    frame #19: 0x0000000100e965fc libtbb_debug.12.6.dylib`tbb::detail::r1::market::process(rml::job&) + 100
    frame #20: 0x0000000100e9f014 libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::run() + 120
    frame #21: 0x0000000100e9ef7c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::thread_routine(void*) + 44
    frame #22: 0x00000001a2709240 libsystem_pthread.dylib`_pthread_start + 148
(lldb) thread select 7
* thread #7
    frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
libsystem_kernel.dylib`__semwait_signal:
->  0x1a26d0ebc <+8>:  b.lo   0x1a26d0edc               ; <+40>
    0x1a26d0ec0 <+12>: pacibsp
    0x1a26d0ec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a26d0ec8 <+20>: mov    x29, sp
(lldb) bt
* thread #7
  * frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x00000001a25dbd88 libsystem_c.dylib`nanosleep + 216
    frame #2: 0x00000001a2664820 libc++.1.dylib`std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) + 84
    frame #3: 0x0000000100a0ded8 test_arena_constraints`void std::__1::this_thread::sleep_for<long long, std::__1::ratio<1l, 1000000l> >(__d=0x0000000170c569f0) at thread:386:9
    frame #4: 0x0000000100a0ddd0 test_arena_constraints`void utils::SpinWaitWhile<void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'()>(pred=(anonymous class) @ 0x0000000170c56a00) at spin_barrier.h:49:13
    frame #5: 0x0000000100a0dd30 test_arena_constraints`void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(location=0x000000016f412730, comp=(anonymous class) @ 0x0000000170c56a38) at spin_barrier.h:60:5
    frame #6: 0x0000000100a0dcf4 test_arena_constraints`void utils::SpinWaitWhileEq<unsigned long, unsigned long>(location=0x000000016f412730, value=0) at spin_barrier.h:67:5
    frame #7: 0x0000000100a0dc10 test_arena_constraints`void utils::WaitWhileEq::operator(this=0x0000000170c56b1f, location=0x000000016f412730, value=0)<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long) const at spin_barrier.h:84:9
    frame #8: 0x0000000100a0da90 test_arena_constraints`bool utils::SpinBarrier::customWait<utils::WaitWhileEq, utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onWaitCallback=0x0000000170c56b1f, onOpenBarrierCallback=0x0000000170c56b47) at spin_barrier.h:140:13
    frame #9: 0x0000000100a0d9ac test_arena_constraints`bool utils::SpinBarrier::wait<utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onOpenBarrierCallback=0x0000000170c56b47) at spin_barrier.h:159:16
    frame #10: 0x0000000100a0d978 test_arena_constraints`utils::SpinBarrier::wait(this=0x000000016f412720) at spin_barrier.h:163:16
    frame #11: 0x0000000100a0d950 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x000000010115bc58, (null)=0x000000010115bc40)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)::operator()(tbb::detail::d1::blocked_range<unsigned long> const&) const at test_arena_constraints.cpp:133:33
    frame #12: 0x0000000100a0d388 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run_body(this=0x000000010115bc00, r=0x000000010115bc40) at parallel_for.h:119:9
    frame #13: 0x0000000100a0c644 test_arena_constraints`void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010115bc68, start=0x000000010115bc00, range=0x000000010115bc40, ed=0x00000001010d9908)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:447:19
    frame #14: 0x0000000100a0c1b4 test_arena_constraints`void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010115bc68, start=0x000000010115bc00, range=0x000000010115bc40, ed=0x00000001010d9908)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:288:16
    frame #15: 0x0000000100a0be40 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::execute(this=0x000000010115bc00, ed=0x00000001010d9908) at parallel_for.h:172:18
    frame #16: 0x0000000100e8129c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 884
    frame #17: 0x0000000100e77a4c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 80
    frame #18: 0x0000000100e772e8 libtbb_debug.12.6.dylib`tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) + 484
    frame #19: 0x0000000100e965fc libtbb_debug.12.6.dylib`tbb::detail::r1::market::process(rml::job&) + 100
    frame #20: 0x0000000100e9f014 libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::run() + 120
    frame #21: 0x0000000100e9ef7c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::thread_routine(void*) + 44
    frame #22: 0x00000001a2709240 libsystem_pthread.dylib`_pthread_start + 148
(lldb) thread select 8
* thread #8
    frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
libsystem_kernel.dylib`__semwait_signal:
->  0x1a26d0ebc <+8>:  b.lo   0x1a26d0edc               ; <+40>
    0x1a26d0ec0 <+12>: pacibsp
    0x1a26d0ec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a26d0ec8 <+20>: mov    x29, sp
(lldb) bt
* thread #8
  * frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x00000001a25dbd88 libsystem_c.dylib`nanosleep + 216
    frame #2: 0x00000001a2664820 libc++.1.dylib`std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) + 84
    frame #3: 0x0000000100a0ded8 test_arena_constraints`void std::__1::this_thread::sleep_for<long long, std::__1::ratio<1l, 1000000l> >(__d=0x00000001710629f0) at thread:386:9
    frame #4: 0x0000000100a0ddd0 test_arena_constraints`void utils::SpinWaitWhile<void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'()>(pred=(anonymous class) @ 0x0000000171062a00) at spin_barrier.h:49:13
    frame #5: 0x0000000100a0dd30 test_arena_constraints`void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(location=0x000000016f412730, comp=(anonymous class) @ 0x0000000171062a38) at spin_barrier.h:60:5
    frame #6: 0x0000000100a0dcf4 test_arena_constraints`void utils::SpinWaitWhileEq<unsigned long, unsigned long>(location=0x000000016f412730, value=0) at spin_barrier.h:67:5
    frame #7: 0x0000000100a0dc10 test_arena_constraints`void utils::WaitWhileEq::operator(this=0x0000000171062b1f, location=0x000000016f412730, value=0)<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long) const at spin_barrier.h:84:9
    frame #8: 0x0000000100a0da90 test_arena_constraints`bool utils::SpinBarrier::customWait<utils::WaitWhileEq, utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onWaitCallback=0x0000000171062b1f, onOpenBarrierCallback=0x0000000171062b47) at spin_barrier.h:140:13
    frame #9: 0x0000000100a0d9ac test_arena_constraints`bool utils::SpinBarrier::wait<utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onOpenBarrierCallback=0x0000000171062b47) at spin_barrier.h:159:16
    frame #10: 0x0000000100a0d978 test_arena_constraints`utils::SpinBarrier::wait(this=0x000000016f412720) at spin_barrier.h:163:16
    frame #11: 0x0000000100a0d950 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x0000000101143a58, (null)=0x0000000101143a40)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)::operator()(tbb::detail::d1::blocked_range<unsigned long> const&) const at test_arena_constraints.cpp:133:33
    frame #12: 0x0000000100a0d388 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run_body(this=0x0000000101143a00, r=0x0000000101143a40) at parallel_for.h:119:9
    frame #13: 0x0000000100a0c644 test_arena_constraints`void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x0000000101143a68, start=0x0000000101143a00, range=0x0000000101143a40, ed=0x00000001010d9a08)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:447:19
    frame #14: 0x0000000100a0c1b4 test_arena_constraints`void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x0000000101143a68, start=0x0000000101143a00, range=0x0000000101143a40, ed=0x00000001010d9a08)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:288:16
    frame #15: 0x0000000100a0be40 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::execute(this=0x0000000101143a00, ed=0x00000001010d9a08) at parallel_for.h:172:18
    frame #16: 0x0000000100e8129c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 884
    frame #17: 0x0000000100e77a4c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 80
    frame #18: 0x0000000100e772e8 libtbb_debug.12.6.dylib`tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) + 484
    frame #19: 0x0000000100e965fc libtbb_debug.12.6.dylib`tbb::detail::r1::market::process(rml::job&) + 100
    frame #20: 0x0000000100e9f014 libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::run() + 120
    frame #21: 0x0000000100e9ef7c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::thread_routine(void*) + 44
    frame #22: 0x00000001a2709240 libsystem_pthread.dylib`_pthread_start + 148
(lldb) thread select 9
* thread #9
    frame #0: 0x00000001a26cd990 libsystem_kernel.dylib`semaphore_wait_trap + 8
libsystem_kernel.dylib`semaphore_wait_trap:
->  0x1a26cd990 <+8>: ret

libsystem_kernel.dylib`semaphore_wait_signal_trap:
    0x1a26cd994 <+0>: mov    x16, #-0x25
    0x1a26cd998 <+4>: svc    #0x80
    0x1a26cd99c <+8>: ret
(lldb) bt
* thread #9
  * frame #0: 0x00000001a26cd990 libsystem_kernel.dylib`semaphore_wait_trap + 8
    frame #1: 0x0000000100e742f4 libtbb_debug.12.6.dylib`tbb::detail::r1::binary_semaphore::P() + 32
    frame #2: 0x0000000100e9f69c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::internal::thread_monitor::commit_wait(tbb::detail::r1::rml::internal::thread_monitor::cookie&) + 88
    frame #3: 0x0000000100e9f06c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::run() + 208
    frame #4: 0x0000000100e9ef7c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::thread_routine(void*) + 44
    frame #5: 0x00000001a2709240 libsystem_pthread.dylib`_pthread_start + 148
(lldb) thread select 10
* thread #10
    frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
libsystem_kernel.dylib`__semwait_signal:
->  0x1a26d0ebc <+8>:  b.lo   0x1a26d0edc               ; <+40>
    0x1a26d0ec0 <+12>: pacibsp
    0x1a26d0ec4 <+16>: stp    x29, x30, [sp, #-0x10]!
    0x1a26d0ec8 <+20>: mov    x29, sp
(lldb) bt
* thread #10
  * frame #0: 0x00000001a26d0ebc libsystem_kernel.dylib`__semwait_signal + 8
    frame #1: 0x00000001a25dbd88 libsystem_c.dylib`nanosleep + 216
    frame #2: 0x00000001a2664820 libc++.1.dylib`std::__1::this_thread::sleep_for(std::__1::chrono::duration<long long, std::__1::ratio<1l, 1000000000l> > const&) + 84
    frame #3: 0x0000000100a0ded8 test_arena_constraints`void std::__1::this_thread::sleep_for<long long, std::__1::ratio<1l, 1000000l> >(__d=0x000000017187a9f0) at thread:386:9
    frame #4: 0x0000000100a0ddd0 test_arena_constraints`void utils::SpinWaitWhile<void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'()>(pred=(anonymous class) @ 0x000000017187aa00) at spin_barrier.h:49:13
    frame #5: 0x0000000100a0dd30 test_arena_constraints`void utils::SpinWaitWhileCondition<unsigned long, void utils::SpinWaitWhileEq<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long)::'lambda'(unsigned long)>(location=0x000000016f412730, comp=(anonymous class) @ 0x000000017187aa38) at spin_barrier.h:60:5
    frame #6: 0x0000000100a0dcf4 test_arena_constraints`void utils::SpinWaitWhileEq<unsigned long, unsigned long>(location=0x000000016f412730, value=0) at spin_barrier.h:67:5
    frame #7: 0x0000000100a0dc10 test_arena_constraints`void utils::WaitWhileEq::operator(this=0x000000017187ab1f, location=0x000000016f412730, value=0)<unsigned long, unsigned long>(std::__1::atomic<unsigned long> const&, unsigned long) const at spin_barrier.h:84:9
    frame #8: 0x0000000100a0da90 test_arena_constraints`bool utils::SpinBarrier::customWait<utils::WaitWhileEq, utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onWaitCallback=0x000000017187ab1f, onOpenBarrierCallback=0x000000017187ab47) at spin_barrier.h:140:13
    frame #9: 0x0000000100a0d9ac test_arena_constraints`bool utils::SpinBarrier::wait<utils::SpinBarrier::DummyCallback>(this=0x000000016f412720, onOpenBarrierCallback=0x000000017187ab47) at spin_barrier.h:159:16
    frame #10: 0x0000000100a0d978 test_arena_constraints`utils::SpinBarrier::wait(this=0x000000016f412720) at spin_barrier.h:163:16
    frame #11: 0x0000000100a0d950 test_arena_constraints`DOCTEST_ANON_FUNC_38(this=0x000000010115ba58, (null)=0x000000010115ba40)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&)::operator()(tbb::detail::d1::blocked_range<unsigned long> const&) const at test_arena_constraints.cpp:133:33
    frame #12: 0x0000000100a0d388 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::run_body(this=0x000000010115ba00, r=0x000000010115ba40) at parallel_for.h:119:9
    frame #13: 0x0000000100a0c644 test_arena_constraints`void tbb::detail::d1::dynamic_grainsize_mode<tbb::detail::d1::adaptive_mode<tbb::detail::d1::auto_partition_type> >::work_balance<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010115ba68, start=0x000000010115ba00, range=0x000000010115ba40, ed=0x00000001010d9b08)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:447:19
    frame #14: 0x0000000100a0c1b4 test_arena_constraints`void tbb::detail::d1::partition_type_base<tbb::detail::d1::auto_partition_type>::execute<tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38(this=0x000000010115ba68, start=0x000000010115ba00, range=0x000000010115ba40, ed=0x00000001010d9b08)::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>, tbb::detail::d1::blocked_range<unsigned long> >(tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>&, tbb::detail::d1::blocked_range<unsigned long>&, tbb::detail::d1::execution_data&) at partitioner.h:288:16
    frame #15: 0x0000000100a0be40 test_arena_constraints`tbb::detail::d1::start_for<tbb::detail::d1::blocked_range<unsigned long>, DOCTEST_ANON_FUNC_38()::$_6::operator()() const::'lambda'(tbb::detail::d1::blocked_range<unsigned long> const&), tbb::detail::d1::auto_partitioner const>::execute(this=0x000000010115ba00, ed=0x00000001010d9b08) at parallel_for.h:172:18
    frame #16: 0x0000000100e8129c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<false, tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 884
    frame #17: 0x0000000100e77a4c libtbb_debug.12.6.dylib`tbb::detail::d1::task* tbb::detail::r1::task_dispatcher::local_wait_for_all<tbb::detail::r1::outermost_worker_waiter>(tbb::detail::d1::task*, tbb::detail::r1::outermost_worker_waiter&) + 80
    frame #18: 0x0000000100e772e8 libtbb_debug.12.6.dylib`tbb::detail::r1::arena::process(tbb::detail::r1::thread_data&) + 484
    frame #19: 0x0000000100e965fc libtbb_debug.12.6.dylib`tbb::detail::r1::market::process(rml::job&) + 100
    frame #20: 0x0000000100e9f014 libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::run() + 120
    frame #21: 0x0000000100e9ef7c libtbb_debug.12.6.dylib`tbb::detail::r1::rml::private_worker::thread_routine(void*) + 44
    frame #22: 0x00000001a2709240 libsystem_pthread.dylib`_pthread_start + 148
(lldb)
@alexey-katranov
Contributor

It seems we have two issues:

  1. Broken serialization of worker thread requests (more threads than system concurrency are created)
  2. Broken barrier lifetime (some threads from previous epochs are still waiting)

@phprus
Contributor Author

phprus commented Jan 28, 2022

> Broken serialization of worker thread requests (more threads than system concurrency are created)

M1 Max has 10 cores (2 efficient and 8 performance cores)
The test creates 10 threads.

@alexey-katranov
Contributor

> M1 Max has 10 cores (2 efficient and 8 performance cores)

Thank you for the info. I was assuming 8 cores, so my assumption above is not correct.

@phprus
Contributor Author

phprus commented Feb 16, 2022

Any news?

@alexey-katranov
Contributor

It seems the testing approach is broken on ARM: any test that uses utils::SpinBarrier might hang without there being a real issue. We are thinking about a better approach for testing.

@phprus
Contributor Author

phprus commented Jun 7, 2022

Is there any news on this issue?

@phprus
Contributor Author

phprus commented Sep 7, 2022

@alexey-katranov
This is not a utils::SpinBarrier issue on ARM.

I changed the test_arena_constraints test. Original test case:

TEST_CASE("Test memory leaks") {
    constexpr size_t num_trials = 1000;

    // To reduce the test session time only one constraints object is used inside this test.
    // This constraints should use all available settings to cover the most part of tbbbind functionality.
    auto constraints = tbb::task_arena::constraints{}
        .set_numa_id(tbb::info::numa_nodes().front())
        .set_core_type(tbb::info::core_types().front())
        .set_max_threads_per_core(1);

    size_t current_memory_usage = 0, previous_memory_usage = 0, stability_counter = 0;
    bool no_memory_leak = false;
    for (size_t i = 0; i < num_trials; i++) {
        { /* All DTORs must be called before GetMemoryUsage() call*/
            tbb::task_arena arena{constraints};
            arena.execute([]{
                utils::SpinBarrier barrier;
                barrier.initialize(tbb::this_task_arena::max_concurrency());
                tbb::parallel_for(
                    tbb::blocked_range<size_t>(0, tbb::this_task_arena::max_concurrency()),
                    [&barrier](const tbb::blocked_range<size_t>&) {
                        barrier.wait();
                    }
                );
            });
        }

        current_memory_usage = utils::GetMemoryUsage();
        stability_counter = current_memory_usage==previous_memory_usage ? stability_counter + 1 : 0;
        // If the amount of used memory has not changed during 5% of executions,
        // then we can assume that the check was successful
        if (stability_counter > num_trials / 20) {
            no_memory_leak = true;
            break;
        }
        previous_memory_usage = current_memory_usage;
    }
    REQUIRE_MESSAGE(no_memory_leak, "Seems we get memory leak here.");
}

  1. Replaced utils::SpinBarrier with the C++20 std::barrier (https://en.cppreference.com/w/cpp/thread/barrier).
  2. Added output of one character (the chunk index) from each thread. The mutex lock around the output is commented out in the code below.
    Full code:
TEST_CASE("Test memory leaks") {
    constexpr size_t num_trials = 1000;

    // To reduce the test session time only one constraints object is used inside this test.
    // This constraints should use all available settings to cover the most part of tbbbind functionality.
    auto constraints = tbb::task_arena::constraints{}
        .set_numa_id(tbb::info::numa_nodes().front())
        .set_core_type(tbb::info::core_types().front())
        .set_max_threads_per_core(1);

    std::mutex m;
    size_t current_memory_usage = 0, previous_memory_usage = 0, stability_counter = 0;
    bool no_memory_leak = false;
    for (size_t i = 0; i < num_trials; i++) {
        { /* All DTORs must be called before GetMemoryUsage() call*/
            tbb::task_arena arena{constraints};
            arena.execute([&m]{
                // ---
                auto max_concurrency = tbb::this_task_arena::max_concurrency();
                std::cerr << std::endl << std::endl << max_concurrency << std::endl << std::endl;

                std::barrier barrier(tbb::this_task_arena::max_concurrency());
                // utils::SpinBarrier barrier;
                // barrier.initialize(tbb::this_task_arena::max_concurrency());
                // ---
                tbb::parallel_for(
                    tbb::blocked_range<size_t>(0, tbb::this_task_arena::max_concurrency()),
                    [&barrier, &m](const tbb::blocked_range<size_t>&r) {
                        // ----
                        auto s = r.end()-r.begin();
                        (void)m;
//                        m.lock();
                        if (s != 1) {
                            std::cerr << "\nInvalid chunk!!!\n";
                            abort();
                        }
//                        std::cerr << r.begin() << "|" << std::flush;
                        std::cerr << r.begin() << std::endl;
//                        m.unlock();
                        barrier.arrive_and_wait();
                        // barrier.wait();
                        // ----
                    }
                );
                // ----
                std::cerr << "END" << std::endl;
                // ----
            });
        }

        current_memory_usage = utils::GetMemoryUsage();
        stability_counter = current_memory_usage==previous_memory_usage ? stability_counter + 1 : 0;
        // If the amount of used memory has not changed during 5% of executions,
        // then we can assume that the check was successful
        if (stability_counter > num_trials / 20) {
            no_memory_leak = true;
            break;
        }
        previous_memory_usage = current_memory_usage;
    }
    REQUIRE_MESSAGE(no_memory_leak, "Seems we get memory leak here.");
}

Run test:

ctest --timeout 18 --output-on-failure -R test_arena_constraints --repeat-until-fail 200

Output 1:

    Test #68: test_arena_constraints ...........***Timeout  18.04 sec
oneTBB: SPECIFICATION VERSION	1.0
oneTBB: VERSION		2021.8
oneTBB: INTERFACE VERSION	12080
oneTBB: TBB_USE_DEBUG	0
oneTBB: TBB_USE_ASSERT	0
oneTBB: ALLOCATOR	scalable_malloc
oneTBB: TOOLS SUPPORT	disabled
oneTBB: TBBBIND	UNAVAILABLE
TBBmalloc: SPECIFICATION VERSION	1.0
TBBmalloc: VERSION		2021.8
TBBmalloc: INTERFACE VERSION	12080
TBBmalloc: TBB_USE_DEBUG	0
TBBmalloc: TBB_USE_ASSERT	0
TBBmalloc: huge pages	not requested
[doctest] doctest version is "2.4.7"
[doctest] run with "--help" for options


10

0
5
1
6
7
9
8
2
3
4
END


10

0
5
7
16

8
2
9
3


0% tests passed, 1 tests failed out of 1

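In the first trial, all 10 chunks are executed. In the second trial, only 9 indices are printed (index 4 never appears), so one chunk is never started. Where several digits share a line (for example 16), the output of different threads was interleaved because the lock is commented out.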
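
The check for which chunk never ran can be done mechanically. The following is a minimal sketch, not part of the test: `missing_indices` is a hypothetical helper that takes the text of one trial (the lines between the `10` header and `END`, without the header itself) and reports the indices that were never printed. It assumes every index is a single digit, which holds here because max_concurrency is 10.

```cpp
#include <cassert>
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper (not part of test_arena_constraints): returns the chunk
// indices in [0, n) that never appear in one trial's output. Because the
// output of different threads may be interleaved on one line, it looks at
// individual digit characters, so it only works for n <= 10.
std::vector<int> missing_indices(const std::string& trial_output, int n) {
    assert(n >= 0 && n <= 10);
    std::vector<bool> seen(n, false);
    for (unsigned char c : trial_output) {
        if (std::isdigit(c) && c - '0' < n) {
            seen[c - '0'] = true;
        }
    }
    std::vector<int> missing;
    for (int i = 0; i < n; ++i) {
        if (!seen[i]) {
            missing.push_back(i);
        }
    }
    return missing;
}
```

Applied to the second trial of Output 1 (the lines 0, 5, 7, 16, 8, 2, 9, 3), this reports index 4 as the only missing chunk.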

Output 2:

    Test #68: test_arena_constraints ...........***Timeout  18.04 sec
oneTBB: SPECIFICATION VERSION	1.0
oneTBB: VERSION		2021.8
oneTBB: INTERFACE VERSION	12080
oneTBB: TBB_USE_DEBUG	0
oneTBB: TBB_USE_ASSERT	0
oneTBB: ALLOCATOR	scalable_malloc
oneTBB: TOOLS SUPPORT	disabled
oneTBB: TBBBIND	UNAVAILABLE
TBBmalloc: SPECIFICATION VERSION	1.0
TBBmalloc: VERSION		2021.8
TBBmalloc: INTERFACE VERSION	12080
TBBmalloc: TBB_USE_DEBUG	0
TBBmalloc: TBB_USE_ASSERT	0
TBBmalloc: huge pages	not requested
[doctest] doctest version is "2.4.7"
[doctest] run with "--help" for options


10

0
5
1
7
6
8
2
9
3
4
END


10

0
215
468

3


7
9

END


10

0
5
7
168924



3


END


10

0
5
2
3
41
6
9
8

7
END


10

0
5
7
2149



8
3
6
END


10

0
5
2
16
3
9847




END


10

0
5
738162



49



END


10

0
5
1
3
7
24
9
6

8
END


10

0
5
7
69
312


8

4
END


10

0
513
84
76
9




2
END


10

0
5
7
2
1
8
9
3
4
6
END


10

0
5
7
2
1
6
84

93

END


10

0
5
7
2
6
3
9
481


END


10

0
5
2
34
96
1


8
7
END


10

0
578
63
2

4
1
9


END


10

0
5
21

87

9
6
3
4
END


10

0
5
327
1
8


4
6
9
END


10

0
5
7
386
9

1

24

END


10

0
583492

6

1


7

END


10

0
58
62

7941

3



END


10

0
537

249
6
8


1

END


10

0
5
7
6
2
891
3


4
END


10

0
5
237
149

68




END


10

0
5
284
36
79

1



END


10

0
5
2
3184
6
79




END


10

0
5
26

978

3
41


END


10

0
5
2
16

74
8
9

3
END


10

0
5
7
2
6
1
4
3
8
9
END


10

0
5
2
1
8463

79



END


10

0
5
7134


9
8
2

6
END


10

0
5
7963
28
41





END


10

0
5
7
628

9
1

4
3
END


10

0
5
3
74289
1



6

END


10

0
2
7
1894



6
3
5
END


10

0
5
7
6
2
8
39
1

4
END


10

0
5
7
698
3

2
1

4
END


10

0
53
71
298

4


6

END


10

0
5
2
6874
3
19




END


10

0
57891


3
2

64


END


10

0
5
7
8
9
2
1
6
3


0% tests passed, 1 tests failed out of 1

In the last trial, one chunk is not executed (only 9 digits are printed).

@phprus
Contributor Author

phprus commented Sep 7, 2022

cc @kboyarinov, @pavelkumbrasev

@pavelkumbrasev
Contributor

Hi @phprus, the problem is that Alex did not write down all the details behind the scenes.
That was several months ago (so I might be mistaken), but here is the root cause:
The way oneTBB shares tasks across threads relies on a sort of weak ordering. I am talking about task spawning and signal propagation inside the internal arena.
While this is fine on architectures like x86, it might lead to problems (for example, hangs) on weaker memory models (Apple M1, for example).
It should also reproduce very rarely, and I think only in cases where we use, for example, barriers (it does not matter whether the barrier is the oneTBB one or the standard one).
As Alex mentioned, "We are thinking about better approach for testing." We have not come to any results yet.
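
For readers less familiar with memory models, here is a general illustration of what ordering means when one thread publishes work to another. It is a sketch in standard C++ only, not oneTBB code, and it makes no claim about how the arena actually signals workers; the name `publish_and_read` is made up for this example. The release store and the acquire load form the pair that makes the plain write to `payload` visible to the consumer.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// General illustration of publication ordering, NOT oneTBB code.
int publish_and_read() {
    int payload = 0;                 // plain (non-atomic) data
    std::atomic<bool> ready{false};  // publication flag
    int seen = -1;

    std::thread producer([&] {
        payload = 42;
        // release: the write to payload cannot be reordered after this store
        ready.store(true, std::memory_order_release);
    });
    std::thread consumer([&] {
        // acquire: pairs with the release store above
        while (!ready.load(std::memory_order_acquire)) {
            std::this_thread::yield();
        }
        seen = payload;  // guaranteed to observe 42
    });

    producer.join();
    consumer.join();
    assert(seen == 42);
    return seen;
}
```

If both operations on `ready` used `std::memory_order_relaxed`, the read of `payload` would be a data race in the C++ memory model. Such bugs tend to stay hidden on x86 hardware, which does not reorder stores with other stores, and show up more readily on ARM.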

@phprus
Contributor Author

phprus commented Sep 7, 2022

@pavelkumbrasev
Thanks for your reply!

On ARM this error occurs very frequently. On average, the test needs to be run fewer than 50 times before it fails.

I think issue #712 might have the same cause, and that issue reproduces on x86.

In addition, for the test_collaborative_call_once test, I found a configuration that hangs 100% of the time.
See my comment #712 (comment) and commit phprus@eeb0154 with the new CI config. With this config (with -DCMAKE_INTERPROCEDURAL_OPTIMIZATION=ON), test_collaborative_call_once hangs 100% of the time.

@pavelkumbrasev
Contributor

pavelkumbrasev commented Sep 8, 2022

This problem is generally similar to this reduced example:

std::atomic<int> counter{0};
tbb::task_group g;
g.run([&] { ++counter; });
while (counter == 0);

This example might hang on any system because oneTBB doesn't guarantee parallelism, so g.wait() should be called.
This is a key part of the oneTBB design ("weak semantics" for signal propagation), and this part should be critical for performance.
However, I hope we will discuss and investigate it more.
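
To make the point about non-guaranteed parallelism concrete, here is a minimal sketch in standard C++. `LazyTaskGroup` is a hypothetical stand-in, not the oneTBB implementation: it models one schedule that the lack of a parallelism guarantee permits, namely that no worker thread ever picks up the task, so it runs only when the submitting thread calls `wait()`. The helper `observe_counter` is also made up for the example.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <queue>
#include <utility>

// Hypothetical stand-in for a task group, NOT the oneTBB implementation.
// It models the case where no worker ever joins: tasks submitted with run()
// are executed only by the thread that calls wait().
class LazyTaskGroup {
public:
    void run(std::function<void()> task) { pending_.push(std::move(task)); }

    void wait() {
        while (!pending_.empty()) {
            auto task = std::move(pending_.front());
            pending_.pop();
            task();
        }
    }

    ~LazyTaskGroup() { assert(pending_.empty() && "wait() was not called"); }

private:
    std::queue<std::function<void()>> pending_;
};

// Returns the counter value observed before and after wait().
std::pair<int, int> observe_counter() {
    std::atomic<int> counter{0};
    LazyTaskGroup g;
    g.run([&] { ++counter; });
    int before = counter.load();  // still 0: `while (counter == 0);` here would spin forever
    g.wait();                     // the submitting thread runs the task itself
    int after = counter.load();
    return {before, after};
}
```

Under this schedule the counter is 0 before `wait()` and 1 after it, which is why code that blocks on a task's side effect without calling `wait()` can hang.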

@pavelkumbrasev
Contributor

Also, @isaevil, could you please look at #712 and try to confirm that this is a similar problem?

@phprus
Contributor Author

phprus commented Sep 8, 2022

@pavelkumbrasev
Is the assumption that tbb::parallel_for will be executed on tbb::this_task_arena::max_concurrency() threads wrong?
So the problem is not in utils::SpinBarrier itself, but in the fact that such a barrier cannot be written for tbb::this_task_arena::max_concurrency() threads, because the real number of threads is not known?

@pavelkumbrasev
Contributor

> the real number of threads is not known?

oneTBB does not guarantee that any thread other than the one that started the parallel_for will take part in it.
But we often use utils::SpinBarrier in our tests on the assumption that worker threads will join the arena with almost 100% probability.

As you can see, sometimes that is not true.
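
One way a test could avoid hanging when not every expected thread shows up is to bound the wait. The following is only a sketch of that idea in standard C++, not utils::SpinBarrier and not something proposed in this thread: `TimedBarrier` and `arrive_and_wait_for` are hypothetical names, and the barrier is single-use.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <cstddef>
#include <mutex>

// Hypothetical single-use barrier for tests, NOT utils::SpinBarrier.
// Instead of blocking forever, a participant gives up after a timeout.
class TimedBarrier {
public:
    explicit TimedBarrier(std::size_t expected) : expected_(expected) {
        assert(expected > 0);
    }

    // Returns true if all expected participants arrived before the timeout,
    // false otherwise (the caller can then report a failure instead of hanging).
    bool arrive_and_wait_for(std::chrono::milliseconds timeout) {
        std::unique_lock<std::mutex> lock(mutex_);
        ++arrived_;
        if (arrived_ >= expected_) {
            all_arrived_.notify_all();
            return true;
        }
        return all_arrived_.wait_for(lock, timeout,
                                     [this] { return arrived_ >= expected_; });
    }

private:
    std::mutex mutex_;
    std::condition_variable all_arrived_;
    std::size_t expected_;
    std::size_t arrived_ = 0;
};
```

A false return then means "fewer threads joined than the test assumed", which can be treated as an inconclusive run rather than a hang. It does not change the fact that the test relies on parallelism the library does not promise.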

@phprus
Contributor Author

phprus commented Sep 8, 2022

@pavelkumbrasev @isaevil
Fix for the failed test 137 - test_malloc_overload_disable (Failed) (CI commit phprus@eeb0154):
PR #870. Please review it.

@isaevil
Contributor

isaevil commented Sep 8, 2022

@pavelkumbrasev @phprus test_collaborative_call_once and conformance_collaborative_call_once also have test cases that use a barrier inside TBB parallel constructs in order to check the correctness of the algorithm and to stress test it. Based on the traces @phprus gave at #712 for the hanging tests, it looks like this is a similar problem.

@phprus
Contributor Author

phprus commented Sep 8, 2022

@isaevil
Thanks for your research!
If it is a similar problem, then it is reproducible on x86 (on bare metal and in GitHub Actions).

@nofuturre

@phprus is this issue still relevant?

@phprus
Contributor Author

phprus commented Jul 10, 2024

@nofuturre
Yes, the sporadic hangs on ARM are still a relevant issue.
And on x86_64 too (#1281).

@pavelkumbrasev
Contributor

Hi @phprus, I created a PR to collect contribution ideas: #1411
Do you want to add this as a possible contribution, perhaps at advanced difficulty?
I'm not sure we will find time to work on this problem soon, so a contribution could help with it :)

@pavelkumbrasev
Contributor

By the way, if you have any other ideas, you are welcome to add them to the PR.

@phprus
Contributor Author

phprus commented Jul 12, 2024

@pavelkumbrasev

I tried to research the cause of this problem and found a hang on x86_64 (#1281). At that point my understanding is no longer sufficient to solve it.

Do you have any plans to fix x86_64 bug #1281?

@pavelkumbrasev
Contributor

> Do you have any plans to fix x86_64 bug #1281?

This PR should fix it - #1436
