
new builtin: @volatileUse #6168

Open
andrewrk opened this issue Aug 26, 2020 · 7 comments
Labels
accepted This proposal is planned. proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@andrewrk
Member

andrewrk commented Aug 26, 2020

Use cases:

  • math (see forceEval) - used to make the CPU actually perform float operations when we want to guarantee that hardware interrupts or hardware-based error handling are triggered
  • benchmarks (see Improve crypto benchmarks #6164)

Status quo solution:

/// Force an evaluation of the expression; this tries to prevent
/// the compiler from optimizing the computation away even if the
/// result eventually gets discarded.
pub fn doNotOptimizeAway(val: anytype) void {
    asm volatile (""
        :
        : [val] "rm" (val)
        : "memory"
    );
} 
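
For context, a benchmark typically calls this helper on each otherwise-unused result so that the measured work cannot be optimized out. A minimal sketch, where the benchmark function and its data are hypothetical and only doNotOptimizeAway comes from the snippet above:

const std = @import("std");

fn benchSum(data: []const u64) !void {
    var timer = try std.time.Timer.start();
    var sum: u64 = 0;
    for (data) |x| sum +%= x;
    // Without this call the optimizer may delete the whole loop,
    // because `sum` is never otherwise used.
    doNotOptimizeAway(sum);
    std.debug.print("summing took {d} ns\n", .{timer.read()});
}

The proposed builtin: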
/// Statement. Declares that the very computation of the value is itself a side effect. This includes allocating
/// space in a register or memory location and copying the value into existence in the real, physical universe.
/// In a comptime scope, this builtin is a no-op, because in comptime scope, all computations do in fact occur,
/// and the value does in fact exist, for a period of time, within the compiler's memory.
@volatileUse(val: anytype)

The comptime behavior is consistent with the math use case (in fact it improves that use case, by allowing those functions to execute at comptime), and is not applicable to the benchmark use case.

The main benefits over the status quo solution would be supporting types that are awkward to pass into inline assembly, as well as working in comptime scope. It would also reduce the burden of supporting new architectures, because the builtin could be implemented in an architecture-independent way instead of relying on inline assembly, which is necessarily architecture-specific. The downside would be adding confusion about when to use this versus volatile and atomics, which people already mix up.

This builtin would be defined to be equivalent to this Zig code, with dummy being intentionally inaccessible anywhere else:

fn volatileUse(val: anytype) void {
    @as(*volatile @TypeOf(val), &struct {
        var dummy: @TypeOf(val) = undefined;
    }.dummy).* = val;
}

However, it is to be a builtin so that backend-specific lowering nonsense doesn't end up in userland.
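
As a concrete illustration, a forceEval-style helper in std.math could then become a thin wrapper around the builtin instead of inline assembly. A sketch, assuming the builtin is accepted exactly as specified above:

/// Sketch only: forceEval rewritten on top of the proposed builtin.
/// Unlike the inline-assembly version, this would also work in a
/// comptime scope, where the builtin is defined to be a no-op.
pub fn forceEval(value: anytype) void {
    @volatileUse(value);
}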

@andrewrk andrewrk added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Aug 26, 2020
@andrewrk andrewrk added this to the 0.7.0 milestone Aug 26, 2020
@jedisct1
Contributor

This is a very good idea.

@CurtisFenner

I'm not familiar with exactly what Zig guarantees with respect to inline assembly, so forgive me if this is already explained somewhere. What is the precise guarantee that this intrinsic is supposed to provide?

For example, if you have code like

const r = @declareSideEffect(a + b)

does that guarantee that the floating point addition a + b is actually performed? Or does it just guarantee that the expected resulting value is actually written in-memory to a destination on the stack/register named r? (e.g., what if b is statically known to be 0?)

If you have

fn add(a: anytype, b: anytype) @TypeOf(a, b) {
    return a + b;
}

const r = @declareSideEffect(add(a, b))

does that guarantee that add is called (i.e., that the stack is grown, parameters are passed through the stack/registers, and a value is returned through the stack/registers)? Does that guarantee that the addition + inside add is actually performed? Or does it just guarantee that the expected resulting value is in-memory? What about

fn add(a: anytype, b: anytype) @TypeOf(a, b) {
    return a + 0.0 + b;
}

const r = @declareSideEffect(add(a, b));

does this guarantee that the + 0.0 is performed, by restricting the optimizations that can happen in the invoked function?

Do the answers to these questions change if you're talking about inline assembly, local variable accesses, or internal pointer assignments?

@jedisct1
Contributor

It guarantees that the final value of r is computed, no matter how it is computed.

Without @declareSideEffect, the computation would not happen at all if the compiler can infer that the result is not going to be used.

@SpexGuy
Contributor

SpexGuy commented Aug 27, 2020

@CurtisFenner Assuming this behaves like the asm version, all it guarantees is that the result of the computation exists in a register. If the result of the computation is a pointer, it also escapes the data that is pointed to, which forces future memory clobbers to invalidate that data. So to answer the questions,

const c = ...;
const a = c - b;
const r = @declareSideEffect(a + b);

In this case, the compiler may realize that a + b is just c, and optimize away both c-b and a+b. But it will not mark the computation of c as dead code, even if r is never used.

fn add(a: anytype, b: anytype) @TypeOf(a, b) {
    return a + 0.0 + b;
}
const r = @declareSideEffect(add(a, b));

Again, assuming the floating point rules allow the optimizer to remove + 0.0, it will. It may also inline the add() function. But the value that that function would return must actually be computed and cannot be marked as dead code.


If we can implement @declareSideEffect(someStruct), it would allow for potentially better benchmarks than in other languages. The closest you can get without this is the equivalent of @declareSideEffect(&someStruct), but that forces the (potentially destructured) struct fields to be stored into memory according to the struct's layout, and also escapes that memory, meaning that any future memory clobbers will also clobber the struct data. This can cause extra reads and writes that aren't necessary for the benchmark, and can prevent other optimizations. With the ability to handle a struct by value, we avoid escaping the struct memory and just ensure that each field is computed, escaping any memory the struct references but not the struct itself.
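
To make the difference concrete, here is a hedged sketch; the Result struct and the two sink functions are hypothetical, and only the status quo helper and the proposed builtin come from this issue:

const Result = struct { hash: u64, len: usize };

fn sinkByPointer(r: Result) void {
    // Status quo workaround: forces `r` to be materialized in memory with
    // its full layout and escapes that memory, so later memory clobbers
    // also invalidate the struct's contents.
    doNotOptimizeAway(&r);
}

fn sinkByValue(r: Result) void {
    // Proposed: only requires each field's value to have been computed;
    // the struct itself does not need to be spilled to memory or escaped.
    @volatileUse(r);
}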


There are some other fine-grained clobber functions that might be nice to include, either as functionality in this builtin or as a small family of builtins. I'll go through examples of those from other benchmarking libraries below:


Google's Benchmark library has a second function, ClobberMemory, implemented as

asm volatile("" : : : "memory");

This clobbers any escaped memory without forcing a specific pointer value to be computed. It can be useful in cases where pointers may alias, but you don't want to force the pointer to be computed if the compiler can prove no aliasing.
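
A Zig counterpart would presumably look much like the status quo helper, just without any input operand. A sketch:

/// Sketch of a ClobberMemory equivalent in Zig: makes the compiler assume
/// any escaped memory may have been read or written, without requiring any
/// particular value to be computed.
fn clobberMemory() void {
    asm volatile (""
        :
        :
        : "memory"
    );
}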


Looking at Facebook's Folly benchmarking library, they have four variants of this function (slightly modified for clarity, /// comments mine)

/// for non-pointer types that fit in registers
/// ensures that the value is not marked as dead code
template <typename T>
void doNotOptimizeAwayPrimitive(const T& datum) {
  // The "r" constraint forces the compiler to make datum available
  // in a register to the asm block, which means that it must have
  // computed/loaded it.  We use this path for things that are <=
  // sizeof(long) (they have to fit), trivial (otherwise the compiler
  // doesn't want to put them in a register), and not a pointer (because
  // doNotOptimizeAway(&foo) would otherwise be a foot gun that didn't
  // necessarily compute foo).
  asm volatile("" ::"r"(datum));
}

/// for pointer types (T may not be a pointer, the pointer is the reference)
/// ensures that the pointer value is not marked as dead code
/// also escapes this pointer and clobbers memory, making the
/// compiler think that the data that is pointed to may have been modified
template <typename T>
void doNotOptimizeAwayPointer(const T& datum) {
  // This version of doNotOptimizeAway tells the compiler that the asm
  // block will read datum from memory, and that in addition it might read
  // or write from any memory location.  If the memory clobber could be
  // separated into input and output that would be preferable.
  asm volatile("" ::"m"(datum) : "memory");
}

/// for types that fit in a register, make the compiler think that the value
/// was read and then modified.
template <typename T>
void makeUnpredictablePrimitive(T& datum) {
  /// "+r" marks the datum as both an input and an output
  asm volatile("" : "+r"(datum));
}

/// for types that are too large for a register, make the compiler think
/// that the memory region containing the value was read and then
/// modified.  This also escapes the memory region and clobbers
/// any other escaped regions.  This does not clobber the actual
/// pointer value, just the data that it points to.
template <typename T>
void makeUnpredictablePointer(T& datum) {
  asm volatile("" ::"m"(datum) : "memory");
}

With these, they are trying to provide more fine-grained tools for performing exactly the type of invalidation that is needed. But these tools are imperfect. In doNotOptimizeAwayPointer, the author laments that the memory clobber is too coarse-grained: it also clobbers any other escaped pointers, not just the memory being escaped here. It also makes the compiler think that the memory being escaped may have been written to, potentially causing unnecessary reads after this call.


Overall, if we're looking at integrating clobbers into the language, having some more finely tuned clobbers might be beneficial.
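
For reference, the value/pointer split above translates fairly directly into Zig's existing inline-assembly syntax. A rough sketch with hypothetical names; note that the pointer variant here passes the pointer via an "r" input plus a memory clobber rather than Folly's "m" constraint:

/// Sketch: value variant. Requires the value to exist in a register,
/// but does not clobber memory, so it avoids pessimizing unrelated
/// loads and stores.
fn doNotOptimizeAwayValue(val: anytype) void {
    asm volatile (""
        :
        : [val] "r" (val)
    );
}

/// Sketch: pointer variant. Keeps the pointer itself live and clobbers
/// memory, so the compiler must assume the pointed-to data may have been
/// read or written by the (empty) asm block.
fn doNotOptimizeAwayPtr(ptr: anytype) void {
    asm volatile (""
        :
        : [ptr] "r" (ptr)
        : "memory"
    );
}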

xxxbxxx added a commit to xxxbxxx/zig that referenced this issue Oct 7, 2020
until ziglang#6168 is implemented,
partially revert 0bd53dd
in order to restore the ability to use std.math in comptime functions.
andrewrk pushed a commit that referenced this issue Oct 7, 2020
until #6168 is implemented,
partially revert 0bd53dd
in order to restore the ability to use std.math in comptime functions.
@andrewrk andrewrk modified the milestones: 0.7.0, 0.8.0 Oct 26, 2020
@andrewrk andrewrk modified the milestones: 0.8.0, 0.9.0 May 19, 2021
@andrewrk andrewrk modified the milestones: 0.9.0, 0.10.0 Nov 23, 2021
@andrewrk andrewrk modified the milestones: 0.10.0, 0.11.0 Apr 16, 2022
@andrewrk andrewrk modified the milestones: 0.11.0, 0.12.0 Apr 9, 2023
@andrewrk andrewrk modified the milestones: 0.13.0, 0.12.0 Jul 9, 2023
@andrewrk andrewrk changed the title new builtin: @declareSideEffect new builtin: @volatileUse Dec 3, 2024
@andrewrk andrewrk added the accepted This proposal is planned. label Dec 3, 2024
@wooster0
Contributor

Another option: reuse the volatile keyword and, when not used as a pointer attribute, make it usable like this:

fn x(y: *u8) void {
    // This will at runtime perform some addition operation regardless of optimize mode:
    volatile 1 + 1;
    // This will definitely write to the pointer, which means that in this case
    // the pointer can be used like a `*volatile u8` pointer without needing a cast.
    // This is useful when you only sometimes need the write to be guaranteed
    // to happen.
    volatile y.* = 1;
}

Additionally, this would simplify asm volatile (...); to volatile asm (...); (asm itself no longer needs to know about volatile, because volatility becomes a separate concept that is reused for other cases too).

And the keyword can of course also be used for blocks to make sure all operations in a block happen:

volatile { ... }

@alexrp
Member

alexrp commented Dec 30, 2024

That's an interesting take, but I don't know what we would lower it to for LLVM once it's so generalized.

@wooster0
Contributor

I was thinking about it mostly as a syntactical difference, avoiding a new builtin. But I guess the fact that @volatileUse(y.* = 1) would not work, while volatile y.* = 1; would, does make this a bit different and maybe more difficult to lower. Perhaps it could be changed from accepting statements (like comptime) to only accepting values (like return). Though I think having it accept statements would be useful.

@andrewrk andrewrk modified the milestones: 0.14.0, 0.15.0 Jan 25, 2025