Support for C-style discriminated unions and alignment/padding. #2283
Comments
I'm actually gonna start working on two new composable codecs soon, one of which is the "offset" codec, which lets you temporarily move the cursor up or down. This means that, if all your enum variants have the same size, you can use that size to offset the cursor and read the enum variant. However, if your enum variants don't all share the same size, then I'm not sure how you can achieve this serialisation process, since you need to know the variant in order to know where to start reading it. Also note that the data enum codec allows you to specify custom sizes such as u64, so you could achieve byte alignment, and therefore zero-copy, using a prefix instead of a suffix as well.
Right - all enum variants in this case are of the same size for C discriminated unions (the max size of all enum variants). The offset codec sounds like a good solution - can’t wait to replace my current solution with it :).
Hey, just wanted to let you know I submitted a PR for the Let me know what you think: #2294. EDIT: oh and here's an example of a string codec that stores its size at the end of the buffer using absolute offsets. solana-web3.js/packages/codecs-strings/src/__tests__/string-test.ts Lines 101 to 111 in 9b9a133
Closing this issue since
There exist Solana programs that encode/decode accounts whose data is represented by a C-style discriminated union.
Discriminated unions in C contain their numeric discriminator (of any fixed size, e.g. u8, u16, u32) in their byte representation after their fields, rather than before their fields as in, e.g., Borsh.
The size of a C discriminated union is fixed: it is the size of its largest variant plus the discriminator length.
Fields may optionally be padded such that each field is padded to the size of a pointer (8 bytes in the case of the Solana BPF VM).
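To make the layout above concrete, here is a minimal sketch of encoding and decoding such a union by hand with a `DataView`. The `Shape` type, its three variants, the 8-byte payload region and the `u32` little-endian discriminator are all assumptions made up for illustration; they are not taken from any real program or from the codecs library, and padding is ignored here.

```typescript
// Hypothetical layout (an assumption for illustration only):
//   bytes 0..8  : payload region, sized to the largest variant
//   bytes 8..12 : u32 little-endian discriminator, stored AFTER the payload
const PAYLOAD_SIZE = 8;
const DISCRIMINATOR_SIZE = 4;
const UNION_SIZE = PAYLOAD_SIZE + DISCRIMINATOR_SIZE;

// Hypothetical variants; every variant occupies the same fixed-size region.
type Shape =
  | { kind: "empty" }
  | { kind: "single"; value: number }
  | { kind: "pair"; a: number; b: number };

function encodeShape(shape: Shape): Uint8Array {
  const bytes = new Uint8Array(UNION_SIZE); // zero-filled, so unused payload bytes stay 0
  const view = new DataView(bytes.buffer);
  switch (shape.kind) {
    case "empty":
      view.setUint32(PAYLOAD_SIZE, 0, true);
      break;
    case "single":
      view.setUint32(0, shape.value, true);
      view.setUint32(PAYLOAD_SIZE, 1, true);
      break;
    case "pair":
      view.setUint32(0, shape.a, true);
      view.setUint32(4, shape.b, true);
      view.setUint32(PAYLOAD_SIZE, 2, true);
      break;
  }
  return bytes;
}

function decodeShape(bytes: Uint8Array, offset = 0): Shape {
  if (bytes.length - offset < UNION_SIZE) {
    throw new RangeError("Buffer too small for the union");
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset + offset, UNION_SIZE);
  // Because the union has a fixed size, the discriminator sits at a known offset
  // and can be read before interpreting the payload.
  const discriminator = view.getUint32(PAYLOAD_SIZE, true);
  switch (discriminator) {
    case 0:
      return { kind: "empty" };
    case 1:
      return { kind: "single", value: view.getUint32(0, true) };
    case 2:
      return { kind: "pair", a: view.getUint32(0, true), b: view.getUint32(4, true) };
    default:
      throw new Error(`Unknown discriminator: ${discriminator}`);
  }
}
```

The key point is that the decoder jumps forward by the fixed payload size to read the discriminator, then goes back to offset 0 to read the variant, which only works because all variants share the same size.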
The motivation for representing account data as C types is that they are inherently zero-copy and fixed in byte length, such that encoding/decoding them costs very few instructions in comparison to encoding/decoding with a codec scheme like Borsh.
getDataEnumCodec may be modified to contain options to allow for its discriminator to be appended after all its payload fields, and to support alignment/padding.
getStructCodec may be modified to support alignment/padding.
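As a rough illustration of what alignment/padding support would have to compute, the sketch below lays out struct fields so that each one is padded up to a multiple of 8 bytes, as described earlier in this issue. `alignUp`, `layoutFields` and the `Field` type are hypothetical helpers, not part of `getStructCodec` or any existing API, and the rule used (pad every field to 8 bytes) is the simplified one from this issue rather than the full C ABI alignment rules.

```typescript
// Assumed alignment: the pointer size mentioned above for the Solana BPF VM.
const ALIGNMENT = 8;

// Round `n` up to the next multiple of `alignment`.
function alignUp(n: number, alignment: number = ALIGNMENT): number {
  return Math.ceil(n / alignment) * alignment;
}

// Hypothetical field description: a name and its unpadded size in bytes.
type Field = { name: string; size: number };

// Compute the byte offset of each field and the total padded size,
// assuming every field is padded to a multiple of `alignment`.
function layoutFields(
  fields: Field[],
  alignment: number = ALIGNMENT
): { offsets: Record<string, number>; totalSize: number } {
  const offsets: Record<string, number> = {};
  let cursor = 0;
  for (const field of fields) {
    offsets[field.name] = cursor;
    cursor += alignUp(field.size, alignment);
  }
  return { offsets, totalSize: cursor };
}

// Example: a 1-byte flag, an 8-byte amount and a 4-byte id.
const layout = layoutFields([
  { name: "flag", size: 1 },
  { name: "amount", size: 8 },
  { name: "id", size: 4 },
]);
console.log(layout);
```

A struct codec with padding support would write each field at its computed offset and leave the padding bytes zeroed, so the total size stays fixed regardless of the field values.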
I am currently writing custom implementations of getDataEnumCodec to support C-style discriminated unions and will post them when I have verified that they work as intended. I am still in the middle of checking for gotchas regarding alignment/padding for my own implementation.