
Make enums and tuples stack allocated types #780

Closed
yorickpeterse opened this issue Nov 27, 2024 · 13 comments


yorickpeterse commented Nov 27, 2024

Description

#778 introduces the ability to define stack allocated, immutable value types. As part of that PR I also considered inferring enums and tuples as stack allocated when they are value types, but I think it would be better to always make them stack allocated types.

The challenge this brings is that in our current setup we can't allow borrowing of data on the stack, as we have no compile-time mechanism to ensure the borrowed data isn't moved while a borrow exists.

We'll need to find a way such that enums and tuples can always be allocated on the stack, while still allowing them to contain non-stack data.
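
For contrast, Rust's borrow checker is exactly this kind of compile-time mechanism: moving stack data while a borrow exists is rejected at compile time. A minimal Rust illustration (not Inko code):

fn main() {
    let pair = (1i64, 2i64); // a stack allocated tuple
    let borrow = &pair.0;
    // let moved = pair;     // error[E0505]: cannot move out of `pair`
    //                       // because it is borrowed
    println!("{borrow}");
}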


yorickpeterse added the compiler, feature, and performance labels on Nov 27, 2024
yorickpeterse added this to the 0.18.0 milestone on Nov 27, 2024
yorickpeterse self-assigned this on Nov 27, 2024

yorickpeterse commented Nov 29, 2024

One option is that we could allow borrowing of stack allocated data by copying the outer "shell" that's on the stack, and incrementing the borrow count for any interior heap data. We then expose the resulting type as ref T/mut T instead of T, and when dropping such a type we decrement the borrow counts.

This approach means we're essentially using the interior heap data to reference count the stack data, and it would allow safe borrowing of the data using our existing runtime mechanism. The downside is that the meaning of ref SomeStackType becomes blurry: currently it's collapsed into SomeStackType because they're value types, but in the above setup it would mean "a copy of the outer data but with interior borrows". This can make it unclear whether you should use SomeStackType or ref SomeStackType when e.g. defining arguments.

In other words, it's a clever trick but I'm not sure if it's the right trick.

EDIT: in addition to the above, this approach poses a problem for field assignments: if a field is assigned a new value through a borrow, it would only affect the borrow clone and not the original. This would be the source of many bugs.
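
A minimal Rust sketch of this approach, with hypothetical names (HeapData, Shell): borrowing copies the stack "shell" and bumps the interior borrow count, and the final assertions demonstrate the field assignment problem from the EDIT:

use std::cell::Cell;

// Interior heap data carries a borrow count, as in Inko's runtime.
struct HeapData {
    borrows: Cell<u32>,
    value: i64,
}

// The stack allocated "shell": a tag plus a pointer to interior heap data.
struct Shell<'a> {
    tag: u8,
    data: &'a HeapData,
}

// Borrowing: copy the outer shell, increment the interior borrow count.
fn borrow<'a>(s: &Shell<'a>) -> Shell<'a> {
    s.data.borrows.set(s.data.borrows.get() + 1);
    Shell { tag: s.tag, data: s.data }
}

// Dropping a borrow: decrement the count again.
fn drop_borrow(b: Shell<'_>) {
    b.data.borrows.set(b.data.borrows.get() - 1);
}

fn main() {
    let heap = HeapData { borrows: Cell::new(0), value: 42 };
    let owned = Shell { tag: 0, data: &heap };

    let mut b = borrow(&owned);
    assert_eq!(heap.borrows.get(), 1);
    assert_eq!(b.data.value, 42);

    // The field assignment problem: this only updates the copy; the
    // original `owned.tag` still reads 0.
    b.tag = 7;
    assert_eq!(owned.tag, 0);

    drop_borrow(b);
    assert_eq!(heap.borrows.get(), 0);
}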


yorickpeterse commented Nov 29, 2024

Another approach is to do the following:

  • We disallow assigning tuple fields new values by special-casing this in the compiler
  • We infer enums and tuples as inline if all their values are also inline
  • Values allocated and dropped in the same method are allocated on the stack, and we somehow "pin" their memory so LLVM doesn't implicitly move the data around and mess up any borrows (assuming this is necessary)
  • For more complex cases we could go with more complex inter-procedural escape analysis (#776), though I'm still unsure how much we'd benefit from this
  • Option[T] where T is a pointer type is optimized into T | NULL. This requires generating different MIR instructions for such types
  • Similarly, Result[A, B] could be optimized into a tagged pointer if both A and B are pointers, where the least significant bit indicates which case we're dealing with
    • This can in fact be done for any enum with at most 8 constructors and only one argument, as in this case we can use the lower three bits for the constructor tag (see the sketch below)

This helps with the common cases, though more complex custom enums would still go on the heap. On the other hand, it won't require potentially drastic language changes.
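
To illustrate the tagged pointer idea (a Rust sketch of the technique in general, not Inko's actual MIR): heap allocations are at least 8-byte aligned, so the three least significant bits of a pointer are zero and can hold the constructor tag:

const TAG_MASK: usize = 0b111;

// Pack a constructor tag into the low bits of an aligned pointer.
fn encode(ptr: *const u64, tag: usize) -> usize {
    debug_assert!(tag <= TAG_MASK);
    debug_assert_eq!(ptr as usize & TAG_MASK, 0); // alignment leaves these zero
    (ptr as usize) | tag
}

// Recover the pointer and the tag.
fn decode(tagged: usize) -> (*const u64, usize) {
    ((tagged & !TAG_MASK) as *const u64, tagged & TAG_MASK)
}

fn main() {
    let value = Box::new(42u64);
    let tagged = encode(&*value, 1); // tag 1 could e.g. mean Result's error case
    let (ptr, tag) = decode(tagged);
    assert_eq!(tag, 1);
    assert_eq!(unsafe { *ptr }, 42);
}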

yorickpeterse commented:

I played around with trying to infer enum types as inline when possible. This proves rather tricky, as there are a whole bunch of places where we check for things like "is this a value type?" or "is this a stack type?", and all of those need to be adjusted accordingly. In particular, the current WIP setup results in type specialization generating incorrectly specialized methods, leading to a compile-time panic.

I'm not sure the idea as a whole makes sense either. Fundamentally changing the semantics of a type based on how it's used feels a bit odd, and because inference is a heuristic, there will still be cases where we'd end up heap allocating enums/tuples even when this isn't strictly necessary.


yorickpeterse commented Dec 1, 2024

Since it might be brought up at some point: using Vale's generational references could in theory be an approach. The idea is that instead of using reference counts, each object stores some sort of generation ID. Borrows then use fat pointers that encode the ID set at allocation time (or inherit it when borrowing from an existing borrow). Upon dereferencing, you check if the ID in the pointer equals the first 4-8 bytes of the pointee. If it doesn't, you produce an error.

The problem with this approach is that it hinges on the assumption that nothing will ever overwrite the pointee after it's no longer in use, such that dangling borrows can be tricked into thinking they still point to valid/relevant data. This is terribly unsound and may result in silent incorrect behaviour, and it isn't too difficult to demonstrate either.

For example, if the ID is a 64-bit integer with the value 12379821, and at some point we reuse the memory to write an 8-bit integer with the value 173 followed by more data (but such that the rest of the ID isn't overwritten), then any dangling references will still observe the same ID even though the data is different. This is because the 64-bit integer 12379821 and the 8-bit integer 173 share the same bit pattern for the 8 least significant bits:

12379821 = 0000000000000000000000000000000000000000101111001110011010101101
     173 =                                                         10101101
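
To make the overlap concrete, a small Rust sketch of that exact scenario (the memory layout is hypothetical):

fn main() {
    let old_generation: u64 = 12379821;
    let mut slot = old_generation.to_le_bytes(); // 8 reused bytes of memory

    // The new occupant writes the 8-bit integer 173 into the first byte
    // and leaves the remaining bytes untouched.
    slot[0] = 173;

    // A dangling borrow re-reads the generation and compares it against
    // the one it recorded. Because 12379821 and 173 share their low 8
    // bits, the check passes and the stale borrow goes unnoticed.
    assert_eq!(u64::from_le_bytes(slot), old_generation);
}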

For values on the heap the chances of running into this may be smaller, but for stack data it's much easier to trigger because of how frequently the same chunk of memory is reused.

Inko's model on the other hand doesn't suffer from these issues:

  1. If borrows still exist when an owned value is dropped, we panic, instead of silently continuing
  2. While the u32 borrow counter can overflow, that requires 2^32 live borrows, i.e. 32 GiB of memory for just the borrow pointers (2^32 × 8 bytes). This is something we could in theory check for, but since the chances of running into it are so astronomically small, it's not worth it

yorickpeterse commented:

Regarding #780 (comment) (the shell-copying approach above): I think this has merit after all. That is, if inline simply means "allocate on the stack, but don't necessarily treat it as a value type", then ref SomeStackType borrowing the interior data shouldn't be surprising, because it's essentially the same as if the type were a heap type.

The requirement to make this work, though, is to disallow assigning new values to the fields of stack allocated types. You can still modify heap values in place, as those are operated on through a pointer. Since replacing a field's value entirely isn't nearly as common as in-place mutation, I think that could be fine.

Implementation-wise this means inline means "put on the stack", and we'd need a separate keyword to also turn the type into a value type. What we could do is replace let with val, then use class val Foo (or type val Foo) in the future to signal a value type, keeping the number of keywords the same. The downside is that val number = 42 may then look a little confusing, because it just means "define a variable with the value 42".

yorickpeterse commented:

Using the above setup, the semantics of a stack allocated type are as follows:

  • T: owned. When dropped, we run its destructor similarly to heap types. The only difference is that the type resides on the stack and has no object header, similar to our current inline value type setup.
  • ref T / mut T: borrows. The data on the stack is copied as part of the borrow, and the borrow count for any heap values stored in fields is incremented. When the borrow is dropped, we decrement those counts.
  • fn and fn mut methods similarly act on an outer copy instead of a pointer, otherwise they may invalidate the pointer through a mutation
    • For fn methods with only immutable arguments, we can pass a pointer as we'll never invalidate any stack data
  • Field assignments won't be allowed
  • We still generate a dropper as usual and check the borrow counts. In other words, inline T for all intents and purposes is the same as T except it resides on the stack and won't have an object header
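
A rough Rust sketch of that last point, with where exactly the borrow count lives left as a hypothetical detail:

use std::cell::Cell;

struct Value {
    borrows: Cell<u32>, // some borrow count reachable from the value
}

// The generated dropper: check the borrow count, then run the destructor.
fn dropper(value: Value) {
    if value.borrows.get() != 0 {
        // Same behaviour as heap types: panic rather than leave
        // dangling borrows behind.
        panic!("can't drop a value that is still borrowed");
    }
    // ...run the destructor and release any heap fields...
}

fn main() {
    dropper(Value { borrows: Cell::new(0) }); // fine: no outstanding borrows
}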

yorickpeterse commented:

In this setup, the decision between a heap and stack type comes down to the following:

Use heap types if:

  • The type is recursive
  • The type is large (e.g. larger than 256 bytes or something along those lines)
  • The type has many fields (regardless of their size), as the borrow count of each heap value they store needs to be incremented upon a borrow
  • You want to be able to assign fields new values
  • The lifetime is likely to be long/the data is going to be passed around a lot, as using a heap type in this case saves you stack space and (possibly) reduces the number of registers needed for it
  • You want to use the type in generic contexts a lot, but don't want to pay the specialization/monomorphization cost

Stack types on the other hand would be used for the exact opposite cases.

This then raises the question: should heap types be the default, or stack types? Since heap types come with an extra allocation cost, perhaps it's better to default to allocating values on the stack?


yorickpeterse commented Dec 2, 2024

Assuming we default to allocating data on the stack, we could use the following (replacing let with val in the process, to cut down the number of keywords):

  • type Foo:
    • Goes on the stack
    • Not a value type
    • Disallows field assignments
    • Disallows recursive types
    • Specialized using static dispatch
    • No object header
    • Can't be cast to traits
    • Borrowing incurs a copy of the stack allocated data, and an increment for any fields that store box values
  • type val Foo:
    • Goes on the stack
    • Is an immutable value type (currently class inline)
    • Disallows recursive types
    • Only allows other val types to be stored within (Int, etc)
    • Specialized using static dispatch
    • No object header
    • Can't be cast to traits
    • Borrowing results in a bitwise copy
  • type box Foo:
    • Goes on the heap
    • Not a value type
    • Allows field assignments
    • Allows recursive types
    • Allows for self-referential types
    • Specialized using dynamic dispatch
    • Has an object header
    • Can be cast to traits
    • Borrowing incurs a single increment

A more practical example would be a doubly linked list: a node may need to mutably borrow its predecessor and be able to mutate it in place. This requires indirection, and thus a boxed type is necessary:

type box Node[T] {
  val @value: T
  val @next: Option[Node[T]]
  val @prev: Option[mut Node[T]]
}

For enums and tuples we'd then define them as type enum, removing the need for any sort of special-casing of these types. For enums you can still choose to box them like so:

type box enum Example {
  case A(Int)
  case B(Example)
  case C
}

The downside as far as compile times go is that we'll have to generate more specialized versions of types and methods, as stack allocated data is specialized by type rather than by a common shape. This is ultimately a trade-off I think we have to make, and one can still use type box to specialize through dynamic dispatch.

In addition, for types that store many heap fields, a borrow becomes more expensive. For example, if some type Foo stores 45 fields containing box types, a borrow of Foo incurs 45 reference count increments upon borrowing and 45 decrements upon dropping the borrow. This could make borrowing such values expensive, though I suspect that over time we'll be able to optimize most of those increments away.

yorickpeterse commented:

The need for incurring interior reference count increments could be considered a potential footgun. For example, Map.iter yields an Entry for each value. If that value is a stack allocated type, then every time you refer to it you incur an increment for all interior fields that store heap data.

While we probably could optimize away many such increments, it's still a potential performance bottleneck.


yorickpeterse commented Dec 2, 2024

Another thing to keep in mind: if a stack type contains some uni T, then we can't borrow the stack type because that would result in violating the uniqueness constraints of the uni T. This means that in general you can only store uni T values in heap allocated types. This in turn means that stack allocated types might not be ideal as a default, as doing so would further complicate working with uni T values.

This in fact poses a challenge for enums: if enums go on the stack, and stack borrows involve an outer copy, then we can't store uni T values inside enums, because borrowing the enum would result in aliasing the uni T value which isn't allowed.
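
A minimal Rust sketch of the hazard, using a Box moved to another thread as a stand-in for a uni T moved to another process (names illustrative):

use std::thread;

fn main() {
    let unique = Box::new(vec![1, 2, 3]); // stands in for a `uni T`
    let alias: *const Vec<i32> = &*unique; // what a copied shell would keep

    println!("alias points at {alias:p}");

    // The unique value moves to another thread while the alias lives on,
    // so two threads can now reach the same data.
    thread::spawn(move || drop(unique)).join().unwrap();

    // Dereferencing `alias` here would be a use after free / data race:
    // exactly the aliasing that `uni` exists to rule out.
}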

yorickpeterse commented:

One option is to allow stack allocated types to store uni T values, but have doing so prohibit borrowing the stack type as a whole. This means e.g. Option[uni User] is perfectly valid, but ref Option[uni User] isn't.

What we strictly can't do is expose the borrow as Option[uni ref User], because this would allow moving the original uni User to a different process, resulting in concurrent access to the same data. This is different from storing a uni User in a regular heap type: there you can only move the uni User to a different process by moving it out of the owning type, and you can't alias a uni T directly in a way that lets the alias outlive such a move.

With the above restriction in mind, defaulting to the heap may be a better choice, such that we'd end up with the following:

  • type Foo: on the heap, not a value type
  • type inline Foo: on the stack, not a value type
  • type val Foo: on the stack, a value type


yorickpeterse commented Dec 3, 2024

We don't actually need the type val Foo syntax: if type inline makes something stack allocated, and the instance happens to contain only value types, it's semantically the same as the proposed type val Foo syntax. This means we don't need to introduce the val keyword.

EDIT: this isn't entirely true: type val Foo means the type is a value type that is implicitly copied around, allowing multiple instances of the same value. type inline just means it's on the stack, but still subject to single ownership.
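
A rough Rust analogy for the distinction: type val behaves like a Copy struct, implicitly duplicated on use, while type inline behaves like a plain struct that moves and keeps single ownership:

#[derive(Clone, Copy)]
struct Val(i64);    // like `type val`: implicitly copied, many instances

struct Inline(i64); // like `type inline`: on the stack, single owner

fn main() {
    let v = Val(1);
    let _a = v;
    let _b = v;      // fine: `v` is implicitly copied

    let i = Inline(2);
    let _c = i;
    // let _d = i;   // error[E0382]: use of moved value: `i`
}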

yorickpeterse commented:

I started work on this in #783.
