Support for C-style discriminated unions and alignment/padding. #2283

lithdew · 2024-03-08T18:24:58Z

There exist Solana programs that encode/decode accounts whose data is represented by a C-style discriminated union.

Discriminated unions in C contain their numeric discriminator (of any fixed size, e.g. u8, u16, u32) in their byte representation after their fields, rather than before their fields as in e.g. Borsh.

The size of a C discriminated union is fixed: the size of its largest variant plus the discriminator length.

There may optionally be padding of fields such that each field is padded to the size of a pointer (for Solana BPF VM case, 8 bytes).
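The size rule described above can be sketched as follows. This is only an illustration under stated assumptions: the `VariantLayout` type, the helper names, and the three example variants are hypothetical, and padding the discriminator itself to the alignment is an assumption that is not stated in this issue.

```typescript
// Hypothetical description of one union variant: the byte sizes of its fields, in order.
type VariantLayout = { name: string; fieldSizes: number[] };

// Round `size` up to the next multiple of `alignment`.
function alignUp(size: number, alignment: number): number {
    return Math.ceil(size / alignment) * alignment;
}

// Payload size of one variant when every field is padded to `alignment` bytes.
function variantSize(variant: VariantLayout, alignment: number): number {
    return variant.fieldSizes.reduce((total, size) => total + alignUp(size, alignment), 0);
}

// Total size: the largest variant payload followed by the discriminator.
// Assumption: the discriminator is padded to `alignment` as well.
function unionSize(variants: VariantLayout[], discriminatorSize: number, alignment: number): number {
    const payload = Math.max(...variants.map(v => variantSize(v, alignment)));
    return payload + alignUp(discriminatorSize, alignment);
}

const exampleVariants: VariantLayout[] = [
    { name: 'Uninitialized', fieldSizes: [] },
    { name: 'Counter', fieldSizes: [8] },
    { name: 'Pair', fieldSizes: [4, 2] },
];

// No padding (alignment of 1): max(0, 8, 6) + 1 = 9 bytes.
const packedSize = unionSize(exampleVariants, 1, 1);
// Fields padded to 8 bytes: max(0, 8, 16) + 8 = 24 bytes.
const alignedSize = unionSize(exampleVariants, 1, 8);
```

Every variant occupies the same number of bytes regardless of which one is active, which is what makes a fixed offset to the discriminator possible.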

The motivation for representing account data as C types is that they are inherently zero-copy and fixed in byte length, such that encoding/decoding them costs very few instructions in comparison to encoding/decoding with a codec scheme like Borsh.

getDataEnumCodec could be extended with options that allow its discriminator to be appended after all its payload fields, and that support alignment/padding.

getStructCodec may be modified to support alignment/padding.

I am currently writing custom implementations of getDataEnumCodec to support C-style discriminated unions and will post them when I have verified that they work as intended. I am still in the middle of checking for gotchas regarding alignment/padding for my own implementation.

lorisleiva · 2024-03-08T18:49:43Z

I'm actually going to start working on two new composable codecs soon, one of which is the "offset" codec, which allows you to temporarily move the cursor up or down.

This means, if all your enum variants have the same size, then you can use that size to offset the cursor and get the enum variant.
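To illustrate the idea of moving the cursor, here is a minimal hand-written sketch that does not use the library at all. It assumes a hypothetical layout of an 8-byte payload followed by a u8 discriminator, with little-endian fields; the `Decoded` type and the two variants are made up for the example.

```typescript
// Hypothetical layout: 8-byte payload followed by a u8 discriminator.
const PAYLOAD_SIZE = 8;

type Decoded =
    | { kind: 'Counter'; value: number }
    | { kind: 'Pair'; a: number; b: number };

function decodeUnion(bytes: Uint8Array): Decoded {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    // Jump past the fixed-size payload to read the discriminator first.
    const discriminator = view.getUint8(PAYLOAD_SIZE);
    // Then go back to offset 0 and read the payload for that variant.
    switch (discriminator) {
        case 0:
            return { kind: 'Counter', value: view.getUint32(0, true) };
        case 1:
            return { kind: 'Pair', a: view.getUint32(0, true), b: view.getUint16(4, true) };
        default:
            throw new Error(`Unknown discriminator: ${discriminator}`);
    }
}

// Payload bytes for a = 42 and b = 7, padded to 8 bytes, then discriminator 1.
const accountData = new Uint8Array([0x2a, 0, 0, 0, 0x07, 0, 0, 0, 1]);
const decodedUnion = decodeUnion(accountData);
```

Because the payload size is the same for every variant, the discriminator position is known before anything else is read.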

However, if your enum variants don't all share the same size, then I'm not sure how you can achieve this serialisation process, since you need to know the variant before you know where its discriminator starts.

Also note that the data enum codec allows you to specify custom discriminator sizes such as u64, so you could achieve byte alignment, and therefore zero-copy, using a prefix instead of a suffix as well.
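As a rough illustration of the prefix alternative, the sketch below lays out a u64 discriminator followed by two u32 fields, so the payload starts at an 8-byte boundary. It is hand-written and does not use the data enum codec; the function names and the layout are assumptions for the example, and only the low 32 bits of the discriminator are used to keep the code free of BigInt.

```typescript
// Assumed layout: [u64 discriminator][u32 a][u32 b], all little-endian.
const DISCRIMINATOR_SIZE = 8;

function encodePrefixed(discriminator: number, a: number, b: number): Uint8Array {
    const bytes = new Uint8Array(DISCRIMINATOR_SIZE + 8);
    const view = new DataView(bytes.buffer);
    // Only the low 32 bits are written; the high 32 bits stay zero.
    view.setUint32(0, discriminator, true);
    view.setUint32(DISCRIMINATOR_SIZE, a, true);
    view.setUint32(DISCRIMINATOR_SIZE + 4, b, true);
    return bytes;
}

function decodePrefixed(bytes: Uint8Array): { discriminator: number; a: number; b: number } {
    const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
    // This sketch only supports discriminators that fit in 32 bits.
    if (view.getUint32(4, true) !== 0) throw new Error('Discriminator exceeds 32 bits');
    return {
        discriminator: view.getUint32(0, true),
        // The payload starts at offset 8, so it stays 8-byte aligned.
        a: view.getUint32(DISCRIMINATOR_SIZE, true),
        b: view.getUint32(DISCRIMINATOR_SIZE + 4, true),
    };
}

const prefixedBytes = encodePrefixed(1, 42, 7);
const prefixedRoundTrip = decodePrefixed(prefixedBytes);
```

The trade-off is seven extra bytes per account compared with a u8 discriminator, in exchange for an aligned payload.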

lithdew · 2024-03-08T18:53:42Z

Right - all enum variants in this case are of the same size for C discriminated unions (the max size of all enum variants).

The offset codec sounds like a good solution - can’t wait to replace my current solution with it :).

lorisleiva · 2024-03-12T13:23:02Z

Hey, just wanted to let you know I submitted a PR for the offsetCodec helper and, I have to say, it is pretty powerful haha.

Let me know what you think: #2294.

EDIT: oh and here's an example of a string codec that stores its size at the end of the buffer using absolute offsets.

solana-web3.js/packages/codecs-strings/src/__tests__/string-test.ts

Lines 101 to 111 in 9b9a133

    
    it('offsets prefixed strings', () => {
        const codec = string({
            size: offsetCodec(
                u8(),
                () => -1,
                () => 0,
            ),
        });
        expect(codec.encode('ABC')).toStrictEqual(b('41424303'));
        expect(codec.decode(b('41424303'))).toBe('ABC');
    });
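For readers unfamiliar with the codec helpers, the byte layout that this test expects can be reproduced by hand. The sketch below is not the library's implementation; the function names are hypothetical, and it assumes ASCII-only input and a u8 size stored in the last byte of the buffer.

```typescript
// Write the string bytes first, then the size as a u8 in the final byte.
// Assumption: ASCII-only text of at most 255 characters.
function encodeSuffixSized(text: string): Uint8Array {
    if (text.length > 255) throw new Error('String too long for a u8 size');
    const bytes = new Uint8Array(text.length + 1);
    for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i);
    bytes[text.length] = text.length;
    return bytes;
}

// Read the size from the last byte, then read that many bytes from offset 0.
function decodeSuffixSized(bytes: Uint8Array): string {
    const size = bytes[bytes.length - 1];
    let text = '';
    for (let i = 0; i < size; i++) text += String.fromCharCode(bytes[i]);
    return text;
}

// Render bytes as lowercase hex for comparison with the test fixture.
function toHex(bytes: Uint8Array): string {
    return Array.from(bytes)
        .map(byte => (byte < 16 ? '0' : '') + byte.toString(16))
        .join('');
}

const suffixEncoded = encodeSuffixSized('ABC');
const suffixHex = toHex(suffixEncoded); // '41424303', as in the test above
const suffixDecoded = decodeSuffixSized(suffixEncoded);
```

The same pattern, reading a value at an offset from the end and then returning to the start, is what makes a trailing discriminator readable.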

lorisleiva · 2024-03-14T10:58:28Z

Closing this issue since offsetCodec (#2294) can now be used to achieve C-style discriminated unions.

github-actions · 2024-03-22T08:02:37Z

Because there has been no activity on this issue for 7 days since it was closed, it has been automatically locked. Please open a new issue if it requires a follow up.

lithdew added the enhancement New feature or request label Mar 8, 2024

lithdew changed the title ~~Support for C-style discriminated unions and padding.~~ Support for C-style discriminated unions and alignment/padding. Mar 8, 2024

lorisleiva closed this as completed Mar 14, 2024

github-actions bot locked as resolved and limited conversation to collaborators Mar 22, 2024
