-
-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make enums and tuples stack allocated types #780
Comments
One option is that we could allow borrowing of stack allocated data by copying the outer "shell" that's on the stack, and incrementing the borrow count for any interior heap data. We then expose the resulting type as This approach means we're essentially using the interior heap data to reference count the stack data, and would allow safe borrowing of the data using our existing runtime mechanism. The downside is that the meaning of In other words, it's a clever trick but I'm not sure if it's the right trick. EDIT: in addition to the above, this approach poses a problem for field assignments: if a field is assigned a new value through a borrow, it would only affect the borrow clone and not the original. This would be the source of many bugs. |
Another approach is to do the following:
This helps with the common cases, though e.g. custom more complex enums would still go on the heap. On the other hand, it won't require potentially drastic language changes. |
I played around with trying to infer I'm not sure the idea as a whole makes sense either. Fundamentally changing the semantics of a type based on how its used feels a bit odd, and due to inference being a heuristic there will still be cases where we'd end up heap allocating enums/tuples even if this isn't strictly necessary. |
Since it might be brought up at some point: using Vale's generational references could in theory be an approach. That is, the idea is that instead of using ref counts, each object stores some sort of generation ID. Borrows then use fat pointers that encode the ID set at allocation time (or inherit it when borrowing from an existing borrow). Upon dereference, you check if the ID in the pointer equals the first 4-8 bytes of the pointee. If this isn't the case, you produce an error. The problem with this approach is that it hinges on the assumption nothing will ever overwrite the pointee after it's no longer in use such that any dangling borrows may be tricked into thinking they still point to valid/relevant data. This is terribly unsound and may result in silent incorrect behaviour. This isn't too difficult to demonstrate either. For example, if the ID is a 64-bits integer with value 12379821, and at some point we reuse the data to write an 8 bits integer with value 173 followed by more data (but such that the rest of the ID isn't overwritten), then any dangling references will still observe the same IDs even though the data is different. This is because the 64-bits integer 12379821 and the 8 bits integer 173 share the same bit patterns for the 8 least significant bits:
For values on the heap the chances/frequency of running into this may be less, but for stack data it's much easier to run into this because of how frequent the same chunk of memory is reused. Inko's model on the other hand doesn't suffer from these issues:
|
Regarding #780 (comment): I think this has merit after all. That is, if The requirement to make this work though is to disallow assigning new field values when using stack allocated types. You can still modify heap values in-place though, as those operate through a pointer. Since assigning fields new values entirely isn't nearly as common as in-place mutations, I think that could be fine. Implementation wise this means |
Using the above setup, the semantics of a stack allocated reference are as follows:
|
In this setup, the decision between a heap and stack type comes down to the following: Use heap types if:
Stack types on the other hand would be used for the exact opposite cases. This then does bring the question: should heap types be the default, or stack types? Since heap types come with an extra allocation cost, perhaps it's better to default to allocating values on the stack? |
Assuming we default to allocating data on the stack, we could use the following (replacing
A more practical example would be a doubly linked list: a node may need to mutably borrow its predecessor and be able to mutate it in place. This requires indirection, and thus a boxed type is necessary:
For enums and tuples we'd then define them as
The downside as far as compile times goes is that we'll have to generate more specialized versions of types and methods as stack allocated data is specialized by type rather than a common shape. This is ultimately a trade-off I think we have to make, and one can still use In addition, for types that store many heap fields a borrow becomes more expensive. For example, if some type |
The need for incurring interior reference count increments could be considered a potential footgun. For example, While we probably could optimize away many such increments, it's still a potential performance bottleneck. |
Another thing to keep in mind: if a stack type contains some This in fact poses a challenge for enums: if enums go on the stack, and stack borrows involve an outer copy, then we can't store |
One option is that for a stack allocated type they can store What we strictly can't do is expose the borrow as With the above restriction in mind, defaulting to the heap may be a better choice, such that we'd end up with the following:
|
We don't actually need the EDIT: this isn't entirely true: |
I started work on this in #783. |
Description
#778 introduces the ability to define stack allocated, immutable value types. As part of that PR I also considered inferring enums and tuples as stack allocated when they are value types, but I think it would be better to always make them stack allocated types.
The challenge this brings is that in our current setup we can't allow borrowing of data on the stack, as we have no compile-time mechanism to ensure the borrowed data isn't moved while a borrow exists.
We'll need to find a way such that enums and tuples can always be allocated on the stack, while still allowing them to contain non-stack data.
Related issues
class
keyword with thetype
keyword #779Related work
No response
The text was updated successfully, but these errors were encountered: