Refactoring the write barrier API #1247

wks · 2024-12-04T09:44:28Z

TL;DR: This issue addresses some recent discussion about write barriers and omitting the slot and target parameters of the write barrier API functions.

We have had some previous discussion about generalizing the subsuming barrier API for tagged references and atomic RMW operations (including CAS). This issue does not discuss subsuming barriers in depth, except to acknowledge that the most general form of subsuming barrier sucks.

The most general write barrier API sucks!

A write barrier that is ultimately general must be general w.r.t. the field representation (pointer, compressed pointer, offsetted pointer, tagged pointer, handle, etc.), the operation (store, compare-and-swap, atomic exchange, etc.), whether the object is multi-copy (like Sapphire, where write barriers write to both the old copy and the new copy), whether non-reference fields need barriers (like Sapphire), and the kind of barrier (object-logging barrier, field-logging barrier, SATB barrier, XOR zone barrier, generational barrier, etc.). Such a barrier is a subsuming barrier that lets the VM binding implement the actual write operation. It has multiple object fields, an optional slot field and optional target fields, and it can be very complicated.

fn object_reference_write(mutator: Mutator,
    // The object
    object: ObjectReference,
    // For GC like Sapphire, the "new copy" or "old copy" of the current object
    mirrored_object: Option<ObjectReference>,
    // Only used by field-logging barriers (LXR).  Other barriers just let `operation` do the actual write.
    slot_addr: Option<Address>,
    // The old target in the slot, or None if the slot was holding NULL, None, nil, false, true, nothing, missing, undef, small integer, etc.
    old_target: Option<ObjectReference>,
    // The target object, or None if storing NULL, None, nil, false, true, nothing, missing, undef, small integer, etc.
    new_target: Option<ObjectReference>,
    // A routine provided by VM binding to do the actual write/swap/CAS.
    // Return the actual old target if different from the `old_target` argument.
    // This can happen in CAS.
    operation: impl FnOnce() -> Option<ObjectReference>);

(p.s. Ask @wks for an example of SATB barrier for AtomicReference.compareAndExchangeAcquire in OpenJDK, or figure it out by yourself)

An API like this should be able to handle plans like GenImmix (ObjectBarrier), LXR (FieldBarrier), CMS (SatbBarrier), Sapphire (multi-write barrier), G1 (XOR zone barrier), @wenyuzhao's hypothetical alternative generational barrier design (generational barrier), etc., and handle VMs like OpenJDK (needs atomic swap and CAS), CRuby (needs tagged reference), V8 (needs tagged reference and multiple flavors of NULL values), etc.

But an API like this will surely scare away 9 out of 10 PhD students or even professors in the field of language/VM implementation, not to mention developers who have "absolutely no idea how to write a programming language".

What's worse, if a VM wants to be fully general, it will need to apply such a subsuming barrier for every non-reference field write, too, just in case the current plan is Sapphire. But that'll slow down all programs, perhaps too slow even for debug builds.

What should we do?

Be practical. Provide a few flavors of pre-post barriers.

MMTk currently only has the ObjectBarrier in the master branch, and it has the field-logging barrier in the lxr branch. Considering the common SATB barriers and the advancing/retreating barriers for concurrent MS, I think a few kinds of barrier API functions will be sufficient to cover all barriers we currently have, and should be general enough for additional kinds of barriers.

Barrier forms

fn object_reference_write_pre_o(mutator: Mutator, object: ObjectReference);
fn object_reference_write_post_o(mutator: Mutator, object: ObjectReference);
fn object_reference_write_pre_ot(mutator: Mutator, object: ObjectReference, old_target: Option<ObjectReference>);
fn object_reference_write_post_ot(mutator: Mutator, object: ObjectReference, new_target: Option<ObjectReference>);
fn object_reference_write_pre_os(mutator: Mutator, object: ObjectReference, slot_addr: Address);
fn object_reference_write_post_os(mutator: Mutator, object: ObjectReference, slot_addr: Address);
fn object_reference_write_pre_ost(mutator: Mutator, object: ObjectReference, slot_addr: Address, old_target: Option<ObjectReference>);
fn object_reference_write_post_ost(mutator: Mutator, object: ObjectReference, slot_addr: Address, new_target: Option<ObjectReference>);

The suffixes o, s and t mean object, slot_addr and target, respectively. The pre barriers only take the old target, while the post barriers only take the new target. The target can be None if it is a NULL, None, nil, nothing, missing, undef, true, false, small integer, symbol, etc.

The o form can support ObjectBarrier. It only needs to log the object.
The os form can support the field-logging barrier. It needs the address of the field in order to access side metadata. Note that slot_addr is not a value of the Slot type. Slot is intended for updating an object graph edge using the Slot::store method, and a Slot is not necessarily inside the MMTk heap (it can be on the stack or in malloc memory).
The ot form can support barriers that need to access the target, such as the SATB barrier (enqueues the old target), the Dijkstra/Steele-style grey mutator barriers (inspects/changes the color of the new target), the XOR zone barrier (computes old_target XOR new_target), etc.
The ost form is the most general form.
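To illustrate how the forms map onto barrier semantics, here is a minimal sketch of an object-logging barrier using the o form and an SATB barrier using the ot form. The `ObjectReference` and `Mutator` types below are toy stand-ins invented for this example (they are not MMTk's real types), the mutator is passed as `&mut Mutator` as an assumption, and the `modbuf`, `logged` and `satb_queue` fields are hypothetical.

```rust
use std::collections::HashSet;

// Toy stand-ins for illustration only; MMTk's real types differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ObjectReference(usize);

#[derive(Default)]
struct Mutator {
    // Objects remembered by the object-logging barrier, in logging order.
    modbuf: Vec<ObjectReference>,
    // Objects already logged (stands in for a per-object log bit).
    logged: HashSet<ObjectReference>,
    // Old targets enqueued by the SATB barrier.
    satb_queue: Vec<ObjectReference>,
}

// `o` form: an object-logging barrier only needs the written object.
fn object_reference_write_post_o(mutator: &mut Mutator, object: ObjectReference) {
    // Log each object at most once.
    if mutator.logged.insert(object) {
        mutator.modbuf.push(object);
    }
}

// `ot` form: an SATB barrier needs the old target, but not the slot address.
fn object_reference_write_pre_ot(
    mutator: &mut Mutator,
    _object: ObjectReference,
    old_target: Option<ObjectReference>,
) {
    // None means the slot held a non-reference value; nothing to enqueue.
    if let Some(old) = old_target {
        mutator.satb_queue.push(old);
    }
}

// Hypothetical usage: returns (remembered objects, SATB queue).
fn barrier_forms_demo() -> (Vec<ObjectReference>, Vec<ObjectReference>) {
    let mut m = Mutator::default();
    let (a, b) = (ObjectReference(0x1000), ObjectReference(0x2000));
    object_reference_write_post_o(&mut m, a);
    object_reference_write_post_o(&mut m, a); // already logged, ignored
    object_reference_write_pre_ot(&mut m, a, Some(b));
    object_reference_write_pre_ot(&mut m, a, None); // slot held e.g. nil
    (m.modbuf, m.satb_queue)
}
```

In this sketch the object-logging barrier never looks at a slot or a target, which is why the o form is enough for it.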

Simplify the API by merging the forms

We can merge those into just two functions:

fn object_reference_write_pre_ost(mutator: Mutator, object: ObjectReference, slot_addr: Option<Address>, old_target: Option<ObjectReference>);
fn object_reference_write_post_ost(mutator: Mutator, object: ObjectReference, slot_addr: Option<Address>, new_target: Option<ObjectReference>);

That is, we use Option<T> to make both the slot_addr and the {old,new}_target optional. We explicitly write into the documentation that if a VM is not able to provide either of those parameters, or if the VM knows the barrier (such as ObjectBarrier) doesn't need the slot_addr or the new_target, it can just pass None for them.

This is actually very similar to what we currently have. We currently have a non-optional slot: Slot parameter, and that's probably the only thing that needs to be changed.

We can also create an InteriorPointer type to make it even clearer by making slot_addr an Option<InteriorPointer>. It emphasizes that if it is Some(iptr), the iptr must be in the MMTk heap (i.e. not in the malloc heap, not on the stack, and not NULL).
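A rough sketch of what the merged form with an InteriorPointer newtype could look like. Everything here is an assumption made for illustration: the heap bounds `HEAP_START`/`HEAP_END`, the `InteriorPointer::new` constructor, the toy `ObjectReference` and `Mutator` types, and the `&mut Mutator` parameter. The barrier body shown is an object-logging one, which ignores both optional parameters.

```rust
// Toy stand-ins for illustration only; MMTk's real types differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct ObjectReference(usize);

// Hypothetical newtype: an address known to lie inside the (toy) heap.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
struct InteriorPointer(usize);

// Assumed heap bounds, chosen arbitrarily for this example.
const HEAP_START: usize = 0x1000_0000;
const HEAP_END: usize = 0x2000_0000;

impl InteriorPointer {
    // Returns None for NULL and for addresses outside the heap
    // (e.g. stack or malloc memory in a real VM).
    fn new(addr: usize) -> Option<Self> {
        (HEAP_START..HEAP_END)
            .contains(&addr)
            .then_some(InteriorPointer(addr))
    }
}

#[derive(Default)]
struct Mutator {
    // Objects remembered by the object-logging barrier.
    modbuf: Vec<ObjectReference>,
}

// Merged post barrier. This object-logging implementation ignores
// both the slot address and the new target.
fn object_reference_write_post_ost(
    mutator: &mut Mutator,
    object: ObjectReference,
    _slot_addr: Option<InteriorPointer>,
    _new_target: Option<ObjectReference>,
) {
    mutator.modbuf.push(object);
}

// Hypothetical usage: both call styles are accepted.
fn merged_form_demo() -> Vec<ObjectReference> {
    let mut m = Mutator::default();
    let obj = ObjectReference(0x1000_0040);
    // A VM that knows the slot and the target can pass both.
    let slot = InteriorPointer::new(0x1000_0048);
    object_reference_write_post_ost(&mut m, obj, slot, Some(ObjectReference(0x1000_0080)));
    // A VM that cannot (or need not) provide them passes None.
    object_reference_write_post_ost(&mut m, obj, None, None);
    m.modbuf
}
```

Making the constructor fallible pushes the "is this address in the heap?" check to the call site, so a `Some(iptr)` argument can be trusted by the barrier.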

Barrier form and Barrier semantics

When using a barrier semantics that needs less information, the VM can invoke a form that provides more information, and it still works.

For example, when using the ObjectBarrier, the VM can actually call object_reference_write_post_ost(m, obj, slot, target), and the ObjectBarrier simply ignores the slot and the target.

When using a barrier semantics that needs more information, but the VM is only able to provide less information, it may or may not work, depending on GC algorithms.

For example, when using a field-logging barrier, if the VM can only identify the object that is changed, because the field is accessed by C extensions or the field is off-heap in malloc memory (this happens in CRuby), it will not be able to log the fields.
- But if we are implementing LXR or other coalescing RC, we can fall back to object-remembering if we can't do field-remembering for a particular object (or field). It will end up remembering more fields, but it still works.
For the SATB barrier, if the VM only tells MMTk that an object is modified, MMTk can conservatively enqueue all children of the object. It is slower when executed, but it is still correct, and it won't even keep more objects alive than the "snapshot at the beginning".
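The SATB fallback described above can be sketched as follows. This is only an illustration under assumptions: the heap is modelled as a `HashMap` from an object to its reference fields, and the function names `satb_pre_ot` and `satb_pre_o`, the `Heap` alias and the toy `ObjectReference` and `Mutator` types are all hypothetical.

```rust
use std::collections::HashMap;

// Toy stand-ins for illustration only; MMTk's real types differ.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
struct ObjectReference(usize);

// Toy heap: each object maps to its reference fields (None = non-reference).
type Heap = HashMap<ObjectReference, Vec<Option<ObjectReference>>>;

#[derive(Default)]
struct Mutator {
    satb_queue: Vec<ObjectReference>,
}

// Precise SATB pre barrier: the VM supplies the overwritten target.
fn satb_pre_ot(m: &mut Mutator, _object: ObjectReference, old_target: Option<ObjectReference>) {
    if let Some(old) = old_target {
        m.satb_queue.push(old);
    }
}

// Conservative SATB pre barrier: the VM only knows which object is about
// to be written, so every current child of that object is enqueued.
fn satb_pre_o(m: &mut Mutator, heap: &Heap, object: ObjectReference) {
    if let Some(fields) = heap.get(&object) {
        m.satb_queue.extend(fields.iter().flatten().copied());
    }
}

// Hypothetical usage: returns (precise queue, conservative queue).
fn satb_fallback_demo() -> (Vec<ObjectReference>, Vec<ObjectReference>) {
    let (parent, a, b) = (ObjectReference(1), ObjectReference(2), ObjectReference(3));
    let mut heap = Heap::new();
    heap.insert(parent, vec![Some(a), None, Some(b)]);

    // The VM overwrites field 0 and can tell us the old target.
    let mut precise = Mutator::default();
    satb_pre_ot(&mut precise, parent, Some(a));

    // The VM can only tell us that `parent` is being modified.
    let mut conservative = Mutator::default();
    satb_pre_o(&mut conservative, &heap, parent);

    (precise.satb_queue, conservative.satb_queue)
}
```

The conservative variant enqueues a superset of what the precise variant enqueues, and everything it enqueues was reachable from the object before the write.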

What about...

What about atomic swap and CAS?

This is a bit complicated. Due to concurrent access, we only know the actual "old value" after we do the swap or CAS. So if we do this:

let old_target = field.load();
let old_target2 = field.swap(new_target);

Then old_target may be different from old_target2.

It remains a question whether the write barrier actually needs the precise old_target2 at all. For SATB barrier, it doesn't because we only need to record the snapshot at the beginning. So only the oldest target matters. But if we do naive RC (it sucks anyway), we'll need the precise old_target2 to do the decrement.
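The load-then-swap discrepancy can be reproduced deterministically with a small sketch. It uses `std::sync::atomic::AtomicUsize` as a stand-in for a reference field, and it simulates the racing mutator with an ordinary store between the load and the swap instead of a real second thread. The function name `swap_with_racing_store` and its parameters are hypothetical.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

// Returns (old_target, old_target2): the value seen by the plain load and
// the value actually replaced by the atomic swap.
fn swap_with_racing_store(
    field: &AtomicUsize,
    racing_value: usize,
    new_target: usize,
) -> (usize, usize) {
    // What a pre barrier would observe before the write.
    let old_target = field.load(Ordering::Acquire);

    // Simulate another mutator storing to the same field in between.
    field.store(racing_value, Ordering::Release);

    // The swap returns the value it really replaced.
    let old_target2 = field.swap(new_target, Ordering::AcqRel);

    (old_target, old_target2)
}
```

With a field initially holding 1, a racing store of 2 and a new target of 3, the load observes 1 while the swap replaces 2. A barrier that must act on the precise old value (such as a naive RC decrement) would have to use the value returned by the swap.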

What about Sapphire?

Those forms don't cover the need to apply barriers for non-reference fields, or the need to write to two copies of the same object. It's not on our agenda.

What about other subsuming barriers?

They are discussed in #1038. The main idea is that making a general API for multiple operations (store, swap, CAS, with acquire/release/seqcst orders), multiple field layouts (fat, tagged, offsetted, compressed, handles, etc.) and non-reference values (tagged integers, true, false, multiple NULL flavors like nil, nothing, missing, undef, etc.) can be very complicated.
