Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[NativeAOT] Cleanup and rationalizing process/thread termination scenario #80063

Merged
merged 11 commits into from
Dec 30, 2022

Conversation

VSadov
Copy link
Member

@VSadov VSadov commented Dec 30, 2022

It was noticed a while ago that while the current scheme works, it has some ad-hoc parts that may not be robust enough for broader set of scenarios such as terminating the process from multiple threads or by calling ExitProcess from native code, or in a context of a .dll

This PR tries to take care of that debt.
Fixes: #74711

  • unified CrstSuspendEE and threadstore lock into one CrstThreadStore
  • the locks had largerly overlapping purpose and one lock is better than two.
  • 3 stage thread detach
  1. call managed pre-mortem callbacks while still safe. (especially important on Unix as we need to release/signal wait subsystem locks)
  2. remove ability to run managed code (this stage takes threadstore lock for thread removal and to make sure that GC cannot happen)
  3. perform remaining cleanup of native data structures.
  • rationalized thread shutdown on Windows vs. non-Windows
  • on Windows the thread that initiates and performs the process shutdown, by calling native ExitProcess, processing CTRL-C or in few other ways is special. Blocking it will prevent process exiting. We detect this thread via DLL_PROCESS_DETACH (.dll) or atexit() (.exe) and remember to not detach this thread. We do not want to run managed code or take threadstore lock and risk blocking this thread when process may already be in inconsitent state.
  • other, not special, threads are allowed to do regular detach sequence. At process shutdown, if timing is tight, some such threads may be destroyed rudely or get into deadlocks. Not much can be done about that, but we do not care at that point.
  • on Unix there are no such special threads.
  • the purpose of PalAttachThread is to register with OS thread premortem notification. PalDetachThread just unregisters.
    We use:
  • fiber detach notification (Windows)
  • pthread_setspecific (Linux, Android)
  • tls destructor (default)

In either case the notification comes without loader/OS locks held, so detaching is safe.
The only case where loader lock could be held is shutting down the special thread on Windows. One more reason to not detach that one.

  • addressed some TODOs and obvious dead code.

@ghost
Copy link

ghost commented Dec 30, 2022

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

It was noticed a while ago that while the current scheme works, it has some ad-hoc parts that may not be robust enough for broader set of scenarios such as terminating the process from multiple threads or by calling ExitProcess from native code, or in a context of a .dll

This PR tries to take care of that debt.
Fixes: #74711

  • unified CrstSuspendEE and threadstore lock into one CrstThreadStore

  • the locks had largerly overlapping purpose and one lock is better than two.

  • 3 stage thread detach

  1. call managed pre-mortem callbacks while still safe. (especially important on Unix as we need to release or signal wait subsystem locks)
  2. remove ability to run managed code (this takes threadstore lock for thread removal and to make sure that GC cannot happen)
  3. perform native cleanup.
  • rationalized thread shutdown on Windows vs. non-Windows

  • on Windows the thread that initiates and performs the process shutdown, by calling native ExitProcess, processing CTRL-C or in few other ways is special. Blocking it will prevent process exiting. We detect this thread via DLL_PROCESS_DETACH (.dll) or atexit() (.exe) and remember to not detach this thread. We do not want to run managed code or take threadstore lock and risk blocking this thread when process may already be in inconsitent state.

  • other, not special, threads are allowed to do regular detach sequence. At process shutdown, if timing is tight, some such threads may be destroyed rudely or get into deadlocks. Not much can be done about that, but we do not care at that point.

  • on Unix there are no such special threads.

  • the purpose of PalAttachThread is to register with OS thread premortem notification. PalDetachThread just unregisters.
    We use:

  • fiber detach notification (Windows)

  • pthread_setspecific (Linux, Android)

  • tls destructor (default)

In either case the notification comes without loader/OS locks held, so detaching is safe.
The only case where loader lock could be held is shutting down the special thread on Windows. One more reason to not detach that one.

  • addressed some TODOs and obvious dead code.
Author: VSadov
Assignees: VSadov
Labels:

area-NativeAOT-coreclr

Milestone: -

@jkotas
Copy link
Member

jkotas commented Dec 30, 2022

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

// Indicate that runtime shutdown is complete and that the caller is about to start shutting down the entire process.
g_threadPerformingShutdown = ThreadStore::RawGetCurrentThread();
}

#ifdef _WIN32
EXTERN_C UInt32_BOOL WINAPI RtuDllMain(HANDLE hPalInstance, uint32_t dwReason, void* pvReserved)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not see any place where this is used. Delete it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed RtuDllMain is used somehow when a NativeAOT dll is created. It is not used as-is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think it is used anywhere. It looks like a left-over from redhawk.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like, at least in a case of .dll publishing, one would need to hookup InitDLL/UninitDLL, and on windows OnProcessExit.

Is that handled somehow or dynamic NativeAOT libraries are not supported?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we have a static lib sample, but no dlls.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we have a static lib sample, but no dlls.

The DLL sample is in the same location: https://github.com/dotnet/samples/tree/main/core/nativeaot/NativeLibrary#building-shared-libraries

Building DLLs is the only officially documented thing. Static lib is just the sample: https://learn.microsoft.com/en-us/dotnet/core/deploying/native-aot/#build-native-libraries

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I missed that, how does the .dll initialize the runtime? Calls RtuDllMain somehow, or calls RhInitialize on the first reverse pinvoke?
The latter should work (and perhaps more reliably) with this change.
Do we have a test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we initialize on first reverse p/invoke. The test is here: https://github.com/dotnet/runtime/tree/main/src/tests/nativeaot/SmokeTests/SharedLibrary

@jkotas
Copy link
Member

jkotas commented Dec 30, 2022

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

@jkotas
Copy link
Member

jkotas commented Dec 30, 2022

/azp run runtime-extra-platforms

@azure-pipelines
Copy link

Azure Pipelines successfully started running 1 pipeline(s).

Copy link
Member

@jkotas jkotas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once the runtime-extra-platforms tests pass.

@VSadov
Copy link
Member Author

VSadov commented Dec 30, 2022

Build coreclr Pri0 Runtime Tests Run windows x86 checked failure looks like #75244

@VSadov
Copy link
Member Author

VSadov commented Dec 30, 2022

There are some failures in Mono tests, but I do not think this PR shares anything with Mono. I see similar failures in #80035
NativeAOT tests have all passed.

@VSadov
Copy link
Member Author

VSadov commented Dec 30, 2022

Thanks!!

@VSadov VSadov merged commit 92e8eab into dotnet:main Dec 30, 2022
@VSadov VSadov deleted the threading1 branch December 30, 2022 23:24
@ghost ghost locked as resolved and limited conversation to collaborators Feb 1, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[NativeAOT] Revisit process/thread termination scenario
3 participants