This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Support for C-style discriminated unions and alignment/padding. #2283

Closed
lithdew opened this issue Mar 8, 2024 · 5 comments
Labels
enhancement New feature or request

Comments

@lithdew
lithdew commented Mar 8, 2024

There exist Solana programs that encode and decode accounts whose data is represented as a C-style discriminated union.

Discriminated unions in C store their numeric discriminator (of any fixed size, e.g. u8, u16, u32) after their fields in their byte representation, rather than before their fields as in, e.g., Borsh.

The size of a C discriminated union is fixed: the size of its largest variant plus the discriminator length.

Fields may optionally be padded so that each one is aligned to the size of a pointer (8 bytes in the case of the Solana BPF VM).

The motivation for representing account data as C types is that they are inherently zero-copy and fixed in byte length, so encoding/decoding them costs very few instructions compared to encoding/decoding with a codec scheme like Borsh.
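As a rough illustration of the layout described above, the following sketch computes where a trailing discriminator would sit and how large the whole union would be. The names `alignUp`, `unionLayout` and `VariantLayout` are hypothetical and not part of any library, and the padding rule (round the payload and the total size up to the alignment) is an assumption about one plausible layout, not a statement about what any particular program does.

```typescript
// Hypothetical description of one variant: only its payload size in bytes matters here.
type VariantLayout = { name: string; payloadSize: number };

// Round `n` up to the next multiple of `alignment` (alignment >= 1).
function alignUp(n: number, alignment: number): number {
  return Math.ceil(n / alignment) * alignment;
}

// Assumed layout: [payload padded to alignment][discriminator][trailing padding].
function unionLayout(
  variants: VariantLayout[],
  discriminatorSize: number,
  alignment: number = 1,
): { discriminatorOffset: number; totalSize: number } {
  if (variants.length === 0) {
    throw new Error('A union needs at least one variant');
  }
  // Every variant occupies the same space: the size of the largest payload.
  const maxPayload = Math.max(...variants.map((v) => v.payloadSize));
  const discriminatorOffset = alignUp(maxPayload, alignment);
  const totalSize = alignUp(discriminatorOffset + discriminatorSize, alignment);
  return { discriminatorOffset, totalSize };
}

const variants: VariantLayout[] = [
  { name: 'small', payloadSize: 4 },
  { name: 'large', payloadSize: 12 },
];

// No padding, u8 discriminator: payload is 12 bytes, discriminator at offset 12.
const packed = unionLayout(variants, 1);
// 8-byte alignment, u8 discriminator: payload padded to 16, total padded to 24.
const aligned = unionLayout(variants, 1, 8);
```

Without padding, the total is simply the largest payload plus the discriminator length; with 8-byte alignment both the discriminator offset and the total size move to the next multiple of 8.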

getDataEnumCodec could be extended with options that place its discriminator after all of its payload fields and that support alignment/padding.

getStructCodec could likewise be extended to support alignment/padding.

I am currently writing custom implementations of getDataEnumCodec to support C-style discriminated unions and will post them when I have verified that they work as intended. I am still in the middle of checking for gotchas regarding alignment/padding for my own implementation.

@lithdew lithdew added the enhancement New feature or request label Mar 8, 2024
@lithdew lithdew changed the title Support for C-style discriminated unions and padding. Support for C-style discriminated unions and alignment/padding. Mar 8, 2024
@lorisleiva
Contributor

I'm actually gonna start working on two new composable codecs soon, one of which is the "offset" codec, which allows you to temporarily move the cursor up or down.

This means, if all your enum variants have the same size, then you can use that size to offset the cursor and get the enum variant.
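To make the idea concrete without relying on any codec library, here is a hand-rolled sketch of the same trick: because every variant occupies the same number of bytes, the decoder can jump straight to the known offset, read the discriminator, and then go back to read the payload. The `Shape` type, the constants and both functions are hypothetical names chosen for illustration, and the layout assumes a u8 discriminator with no padding.

```typescript
// Hypothetical union with two variants; payloads are little-endian u32 fields.
type Shape =
  | { kind: 'circle'; radius: number }
  | { kind: 'rect'; width: number; height: number };

const PAYLOAD_SIZE = 8; // size of the largest variant (rect: two u32 fields)
const TOTAL_SIZE = PAYLOAD_SIZE + 1; // payload followed by a u8 discriminator

function encodeShape(shape: Shape): Uint8Array {
  const bytes = new Uint8Array(TOTAL_SIZE); // unused payload bytes stay zero
  const view = new DataView(bytes.buffer);
  if (shape.kind === 'circle') {
    view.setUint32(0, shape.radius, true);
    view.setUint8(PAYLOAD_SIZE, 0);
  } else {
    view.setUint32(0, shape.width, true);
    view.setUint32(4, shape.height, true);
    view.setUint8(PAYLOAD_SIZE, 1);
  }
  return bytes;
}

function decodeShape(bytes: Uint8Array): Shape {
  if (bytes.length < TOTAL_SIZE) {
    throw new Error(`Expected ${TOTAL_SIZE} bytes, got ${bytes.length}`);
  }
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  // Skip over the fixed-size payload to read the trailing discriminator first.
  const discriminator = view.getUint8(PAYLOAD_SIZE);
  // Then move back to offset 0 and read the payload for that variant.
  switch (discriminator) {
    case 0:
      return { kind: 'circle', radius: view.getUint32(0, true) };
    case 1:
      return {
        kind: 'rect',
        width: view.getUint32(0, true),
        height: view.getUint32(4, true),
      };
    default:
      throw new Error(`Unknown discriminator ${discriminator}`);
  }
}

const encodedRect = encodeShape({ kind: 'rect', width: 3, height: 4 });
const decodedRect = decodeShape(encodedRect);
```

The fixed payload size is what makes this work: the discriminator offset is known before anything about the variant is known.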

However, if your enum variants don't all share the same size, then I'm not sure how you can achieve this serialisation process, since you would need to know the variant in order to know where to start reading the discriminator.

Also note that the data enum codec allows you to specify custom discriminator sizes such as u64, so you could achieve byte alignment, and therefore zero-copy, using a prefix instead of a suffix as well.
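A small sketch of why a wider prefix helps with alignment: with an 8-byte discriminator in front, the payload starts at offset 8, so 8-byte-aligned fields inside it stay aligned, whereas a 1-byte prefix would push them to offset 1. The helper names below are hypothetical, and the u64 is written as two little-endian u32 words purely to keep the example free of BigInt.

```typescript
const PREFIX_SIZE = 8; // a u64 discriminator placed before the payload

// Write the discriminator as a little-endian u64 (low word, then zero high word),
// followed by the raw payload bytes.
function encodeWithU64Prefix(discriminator: number, payload: Uint8Array): Uint8Array {
  const bytes = new Uint8Array(PREFIX_SIZE + payload.length);
  const view = new DataView(bytes.buffer);
  view.setUint32(0, discriminator, true); // low 32 bits
  view.setUint32(4, 0, true); // high 32 bits (always zero in this sketch)
  bytes.set(payload, PREFIX_SIZE);
  return bytes;
}

// The payload begins right after the prefix, so it is aligned exactly when
// the prefix size is a multiple of the required alignment.
function payloadIsAligned(prefixSize: number, alignment: number): boolean {
  return prefixSize % alignment === 0;
}

const prefixed = encodeWithU64Prefix(2, Uint8Array.of(0xaa, 0xbb));
const u64PrefixAligned = payloadIsAligned(8, 8); // payload starts at offset 8
const u8PrefixAligned = payloadIsAligned(1, 8); // payload starts at offset 1
```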

@lithdew
Author

lithdew commented Mar 8, 2024

Right - all enum variants in this case are of the same size for C discriminated unions (the max size of all enum variants).

The offset codec sounds like a good solution - can’t wait to replace my current solution with it :).

@lorisleiva
Contributor

lorisleiva commented Mar 12, 2024

Hey, just wanted to let you know I submitted a PR for the offsetCodec helper and, I have to say, it is pretty powerful haha.

Let me know what you think: #2294.

EDIT: oh and here's an example of a string codec that stores its size at the end of the buffer using absolute offsets.

it('offsets prefixed strings', () => {
    const codec = string({
        size: offsetCodec(
            u8(),
            () => -1,
            () => 0,
        ),
    });
    expect(codec.encode('ABC')).toStrictEqual(b('41424303'));
    expect(codec.decode(b('41424303'))).toBe('ABC');
});

@lorisleiva
Contributor

Closing this issue since offsetCodec (#2294) can now be used to achieve C-style discriminated unions.

Contributor

Because there has been no activity on this issue for 7 days since it was closed, it has been automatically locked. Please open a new issue if it requires a follow up.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Mar 22, 2024