Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Implement RealtimeSanitizer (RTSan) support, add #[realtime(nonblocking)], #[realtime(blocking)] attributes #3766

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

cjappl
Copy link

@cjappl cjappl commented Jan 30, 2025

RealtimeSanitizer is an approach to detecting real-time safety violations in timing critical code. It has been added to LLVM 20, adn this RFC aims to leverage that work and add it to Rust.

See the document for more.

Thanks for considering,
Chris (@cjappl), David (@davidtrevelyan), and Stephen (@steckes)

Rendered

@programmerjake
Copy link
Member

programmerjake commented Jan 30, 2025

one objection to naming the attributes #[blocking]/#[nonblocking] is those are ambiguous since different people have different ideas of what to consider blocking or not based on which domain they're working in, e.g. for async network code reading a file from disk, writing to stdout/stderr, or allocating memory are often considered non-blocking. I'm mentioning this because this kind of annotation and checking has often been mentioned as something wanted for async.
rust-lang/wg-async#19

@ehuss ehuss added the T-lang Relevant to the language team, which will review and decide on the RFC. label Jan 31, 2025
@clarfonthey
Copy link

I also agree that the naming is confusing, since while it is very similar to the goals of what async people want, I think that it's not quite the same.

Perhaps simply having a realtime attribute and having realtime(required) (must be executable in realtime context) and realtime(forbidden) (not allowed in realtime context) would be better, since it indicates what the attributes are actually doing.

@cjappl
Copy link
Author

cjappl commented Jan 31, 2025

I think that seems reasonable. We are open to changing the attribute names. I like the sound of "realtime(required)" and "realtime(forbidden)"

Some other possibilities (spitballing):

For nonblocking:

  • realtime
  • realtime(required)
  • realtime(constrained)
  • rtsan(realtime)
  • rtsan(nonblocking) or realtime(nonblocking) - if we cared about parity with clang

For blocking:

  • realtime(unsafe) - unsafe may be too overloaded in rust for this one to fly
  • realtime(forbidden)
  • rtsan(blocking)

Any of these jump out to folks? Not sure if we want to do "rtsan" or "realtime" as the main attribute name, I like "realtime" better but it is less specific, for better or worse.

```rust
#[nonblocking]
fn process() {
rtsan_scoped_disabler!({
Copy link
Contributor

@madsmtm madsmtm Jan 31, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not familiar with the implementation of RealtimeSanitizer, but would it be possible to implement this kind of thing / sanitizer/rtsan_interface.h in a crate instead? I recall that something like that is possible with Callgrind.

If that's the case, you could perhaps downscope this to just "add RealtimeSanitizer and std support", and let users figure it out for themselves to do something else on cfg!(sanitize = "realtime") if they have some API themselves that they want to mark as realtime.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand, could you elaborate more?

As for internal details, let me give you some more info to help understand the inner workings of rtsan.

The main component is the sanitizer runtime library. This is a simple dynamic library, similar to the runtime components of tsan, asan etc. Like any shared library, if you have the header and the compiled library you can include it in your binary.

So, it would be possible to include the header, and link against the shared library. In fact, we made a standalone proof of concept crate to work out many of the kinks of rtsan in rust and write this proposal, that can be seen here:

https://github.com/realtime-sanitizer/rtsan-standalone-rs

With an article from @steckes on his blog on how he implemented and lessons he learned (not necessary to read for this RFC, but may provide additional info): https://steck.tech/posts/rtsan-in-rust/

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, assuming that std can get special treatment and thus we don't need to worry about that for the design of the feature / the public API, the main part is that rustc is going to link in this shared libclang_rt.rtsan* library when you enable -Zsantizer=realtime (or -Csantizer=realtime), right?

If that is done, could the rtsan-standalone-rs crate be updated such that e.g. realtime_enter would be defined as the following?

pub fn realtime_enter() {
    #[cfg(all(sanitize = "realtime", any(target_os = "macos", target_os = "linux")))]
    unsafe { rtsan_standalone_sys::__rtsan_realtime_enter() };
}

Along with rtsan-standalone-sys being updated to not do anything in build.rs (because rustc already does it for it)?

(Remember that the sanitize = "..." cfg is already provided for the other sanitizers).

If so, then I don't see the reason why these attributes has to be exposed by rustc as a language feature?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My argument here is not so much against a built-in #[blocking] etc., it's just that, if we can avoid having this in the language, and restrict it to just adding a rustc flag, then there's a much higher probability that the feature will be accepted (actually, then I don't even think you'd need an RFC, at most an MCP. But I'm not a compiler team member, so I can't say that for sure).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, assuming that std can get special treatment and thus we don't need to worry about that for the design of the feature / the public API, the main part is that rustc is going to link in this shared libclang_rt.rtsan* library when you enable -Zsantizer=realtime (or -Csantizer=realtime), right?

Yes, you're correct in your understanding here!

If that is done, could the rtsan-standalone-rs crate be updated such that e.g. realtime_enter would be defined as the following?

pub fn realtime_enter() {
    #[cfg(all(sanitize = "realtime", any(target_os = "macos", target_os = "linux")))]
    unsafe { rtsan_standalone_sys::__rtsan_realtime_enter() };
}

Along with rtsan-standalone-sys being updated to not do anything in build.rs (because rustc already does it for it)?

(Remember that the sanitize = "..." cfg is already provided for the other sanitizers).

If so, then I don't see the reason why these attributes has to be exposed by rustc as a language feature?

I see what you're saying, I think! I agree this would make it easier to accept/deny this RFC, the most controversial part is definitely the addition of the attributes (and changes to assert, panic etc).

However, as an end user, I think it would be strange for the compiler to expose the flag, link the library and do nothing until I downloaded and used another third party crate (rtsan-standalone-rs). I think a good proposal from the end users perspective would allow them to do everything with only rustc, not relying on a third party crate.

I think the addition of the attributes into the standard would also encourage libraries to document their expectations - far in the future you could imagine just looking at the signature of functions and reasoning about if they are real-time safe because they have the attribute.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the most controversial part is definitely the addition of the attributes

Again, I'm not the one that's gonna make a decision here, but personally, I would like to evaluate the merits of such attributes separately from the actual sanitizer being added.

However, as an end user, I think it would be strange for the compiler to expose the flag
the addition of the attributes into the standard would also encourage libraries to document their expectations

That's fair, and I agree that having this in the language/std could have ecosystem effects, but then I think the RFC should be written around that; or at least have a section that says "why does this belong in the language/std".

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I'm not the one that's gonna make a decision here, but personally, I would like to evaluate the merits of such attributes separately from the actual sanitizer being added.

In our initial zulip thread, one of the compiler team members said we did not need an RFC for adding the sanitizer, only the attributes (see the thread). However, to piece together the RFC we thought it made most sense to provide all of the context in which the attributes will be used.

Definitely we expect most of the conversation here to be around the attributes!

I'll wait a bit to see what some others say about this. Thanks for the feedback so far, it's appreciated.

An example of an improper allocation being detected in a `nonblocking` function:
```rust
> cat example/src/main.rs
#[nonblocking]
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Attribute naming thread

Existing comments: #3766 (comment) , #3766 (comment) , #3766 (comment)

I'll toss my hat in for realtime(constrained) to mark a "please validate this function" root, and realtime(blocking) to mark a function as forbidden underneath a validation root. Further naming work is needed to have separate validation classes for "blocking event loop" (target 20Hz 50ms) versus "blocking the frame timer" (target 120Hz 8ms - you cannot call Windows Thread sleep) versus "hard realtime" (no allocations)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading further, I see rtsan_scoped_disabler!. I already had text typed to propose realtime(unconstrained) for that usecase, but deleted it after I realized that wasn't the two primary attributes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like realtime(constrained) and realtime(blocking). I would be OK with that being the two.

Any thoughts other co-authors @steckes @davidtrevelyan?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get on board with this style, for sure! If we're happy with realtime(blocking), would we also be happy with realtime(nonblocking)? I suggest this in an effort to maintain the blocking/nonblocking symmetry and consistency with clang, while at the same time being explicit about it being realtime blocking/nonblocking (and thereby reducing confusion with other definitions of blocking and nonblocking). What do you think?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also like to keep it closer to the original documentation, so it is less confusing.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also support realtime(nonblocking/blocking). This may be the best of both worlds as it keeps the documentation green, and doesn't collide with any async work. Any thoughts @riking @clarfonthey @programmerjake?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems good to me!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated the doc to use realtime(nonblocking/blocking) because of the support of all original authors and no objections (thusfar). Of course this can be changed again if more feedback comes in :)

@programmerjake
Copy link
Member

  • rtsan(*)

I think we shouldn't name stuff after the verification mechanism, since that could be changed/extended to use something other than rtsan.

maybe name it hard_realtime() since that's where you'd want to disallow allocator calls and not merely sleep or println! or sometimes Mutex::lock

@cjappl
Copy link
Author

cjappl commented Jan 31, 2025

  • rtsan(*)

I think we shouldn't name stuff after the verification mechanism, since that could be changed/extended to use something other than rtsan.

Seems reasonable to me! Thanks for the feedback, I agree.

maybe name it hard_realtime() since that's where you'd want to disallow allocator calls and not merely sleep or println! or sometimes Mutex::lock

Differentiating between harder and softer realtime is out of the scope of rtsan as it exists today, it only has one level which is "disallow all non-deterministic time calls", which is just generally "realtime", encompassing all levels from hard-soft. I think this would also open up a lot of debate as to what is allowed at each level of hardness. The solution we have gone with is to disallow everything, and let users opt-out via the scoped_disabler or suppression lists if they want to allow allocations or similar.

@programmerjake
Copy link
Member

I think we shouldn't name stuff after the verification mechanism, since that could be changed/extended to use something other than rtsan.

Seems reasonable to me! Thanks for the feedback, I agree.

in that case rtsan_scoped_disabler! should be renamed too -- maybe to something like assume_realtime_unchecked!?

@steckes
Copy link

steckes commented Jan 31, 2025

I think we shouldn't name stuff after the verification mechanism, since that could be changed/extended to use something other than rtsan.

Seems reasonable to me! Thanks for the feedback, I agree.

in that case rtsan_scoped_disabler! should be renamed too -- maybe to something like assume_realtime_unchecked!?

maybe something that is more similar to the no_sanitize which is already existing in Rust, like no_sanitize_realtime?

Just to be clear, I understand the ambiguity with async, but renaming those things means that the official RealtimeSanitizer documentation does not apply for Rust anymore.
-> https://clang.llvm.org/docs/RealtimeSanitizer.html

It removes confusion in async but introduces confusion for people reading the LLVM docs.
I am open for renaming, just want to ask again if everyone still thinks the same when reading the official docs.


```

### The addition of the `rtsan_scoped_disabler!` macro
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@steckes @programmerjake starting a thread to discuss changing or not changing this macro name (so it's not in the main thread)


2. User defined functions marked `blocking`

The new `blocking` attribute allows users to mark a function as unsafe for real-time contexts.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This description elicits potential confusion with unsafe (memory safety). Does such a function become unsafe to call if I compile with RTSan enabled? I do not think it will?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No this just means real-time unsafe and has nothing to do with the unsafe keyword in Rust. Thanks for pointing this out.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll tweak the wording here to make it more obvious what we are talking about! unsafe definitely an overloaded term in general.

To be clear (and echo @steckes): real-time unsafe != memory unsafe/the unsafe rust uses.

text/0000-rtsan-and-attributes.md Outdated Show resolved Hide resolved
text/0000-rtsan-and-attributes.md Outdated Show resolved Hide resolved
text/0000-rtsan-and-attributes.md Outdated Show resolved Hide resolved
Comment on lines 197 to 221
If users rely on `panic!` or `assert!` while running under RealtimeSanitizer, they will hit an intercepted call before the message is printed.

For example:

```rust
#[nonblocking]
fn processor(buffer: &[f32]) {
buffer[512]; // Oops, out of bounds!! should panic!
}

processor(&[0.0; 512]);
```

Running under RTSan the user gets this message. This is due to some memory being allocated to prepare to print the panic message.
```
==31969==ERROR: RealtimeSanitizer: unsafe-library-call
Intercepted call to real-time unsafe function `malloc` in real-time context!
```

A user should expect this out of bounds access to print:
```
index out of bounds: the len is 512 but the index is 512
```

To adhere to this expected behavior, RTSan should be disabled for `panic!` and each of the assertion macros: `assert`, `assert_eq`, `assert_ne`, `debug_assert`, `debug_assert_eq`, `debug_assert_ne`.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that it would be possible to make panic! and the following backtrace process not allocate, but obviously it currently does.

@cjappl cjappl changed the title RFC: Implement RealtimeSanitizer (RTSan) support, add #[nonblocking], #[blocking] attributes RFC: Implement RealtimeSanitizer (RTSan) support, add #[realtime(nonblocking)], #[realtime(blocking)] attributes Feb 1, 2025
@cjappl
Copy link
Author

cjappl commented Feb 10, 2025

Seeing as it has been a week, I'm going to post a little ping/recap comment to encourage more conversation on this RFC.

What happened in the initial week:

  • We updated the attribute names to be #[realtime(nonblocking)] and #[realtime(blocking)] as there appeared to be general support. Discussion is still ongoing.
  • There is discussion on the naming of the rtsan_scoped_disabler! macro
  • We clarified some wording in the RFC, specifically around the use of the term "real-time unsafe" noting that unsafe is an important word to be specific with in rust-land.

Any more thoughts/ideas/input greatly appreciated. Thanks for the discussion so far.

CC: @steckes @davidtrevelyan

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
T-lang Relevant to the language team, which will review and decide on the RFC.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants