Change underlying storage to an actual union #72

schveiguy · 2022-08-29T15:39:28Z

SumType does this and I think because of this, it has more accurate semantics than TaggedUnion, and also needs less machinery around the lifetime issues.

I'm putting this issue report here, because I think it's a good goal to have that I think unlocks some other issues that need fixing, specifically #63 and #44.

s-ludwig · 2022-08-29T15:56:53Z

Do you have specific semantics in mind or lifetime management that could be simplified? Not having GC-scanned void[] storage in the case that none of the types contain pointers is the only thing that I can think of right now.

But it would probably make the code a bit easier to read in any case. There had been issues with unused union parts not getting overwritten that produced false pointers, but those appeared to have gone, judging by vibe.d's Json implementation, which already switched back to union.

schveiguy · 2022-08-29T16:22:05Z

Instead of using memory tricks, and having to remember to call things that the compiler does naturally, we can let the compiler just do its thing. I have a suspicion that there are some shortcuts that might not be correct, but I haven't proven it yet. I found it odd that when I just added the full union to the Dummy union, it started complaining about how e.g. opAssign couldn't be @nogc.

It also paves the way for CT usage.

s-ludwig · 2022-08-30T08:39:52Z

But all the lifetime management (construction, destruction, copy construction/postblit) still needs to be done manually. And opAssign needs a valid value to overwrite, so either a bit-copy of T.init needs to be done in advance, or we need an approach similar to the current emplace one.

(Anyway, I don't really argue against this, we should definitely do it)

schveiguy · 2022-08-30T14:10:18Z

Looking at std.sumtype, @pbackus is destroying the current value, and then assigning the union by constructing a new one (which I assume blits the proper .init value first). I don't think he's calling any postblit/etc manually:
https://github.com/dlang/phobos/blob/9e42570bc28be1452995d288f9667f310df0becd/std/sumtype.d#L620

The ctors look better too:
https://github.com/dlang/phobos/blob/9e42570bc28be1452995d288f9667f310df0becd/std/sumtype.d#L358

In general, the machinery just looks much more straightforward and simple.

I'd like to see a similar approach here. Let the compiler do the work it knows how to do. It also gives the compiler more info about what is going on. When the compiler sees void[N] I think it just ignores all lifetime issues.

s-ludwig · 2022-09-02T07:49:04Z

I don't know, maybe its because I know the code, but I really think the main difference comes down to certain simplifications, such as always doing destroy+blit assign instead of using the copy constructor/postblit of the target type, and not so much to how the storage is defined. I don't see much in terms of actual simplification or compiler assistance (struct alignment definitely is, though).

BTW, I just discovered that I started a WIP branch for this already: https://github.com/s-ludwig/taggedalgebraic/tree/union_storage

schveiguy · 2022-09-02T13:42:11Z

Simplification should be a goal. If TU is storing a union and not void[N], it should utilize the compiler's in-built abilities to ensure lifetime is done correctly. It may not be automatic, it's still a union, but it should be utilized.

I have no objection to using postblit/copy vs. destroy+blit, so long as it's not a manual process.

I'm reminded of my newaa project where I reimplemented the built in associative arrays as a library type. I don't do any special mechanisms for the Entry struct to simulate what the compiler does naturally. Compare to the actual array runtime, which is insanely complex with all kinds of casts and machinery to replace what the compiler should be doing.

I'm not saying this is a requirement, nor that I have a specific vision in mind. But I'd like to have that kind of "aha, I get how this works" feeling when looking at the code, instead of "I have no idea how this works... this looks like black magic... this looks like a compiler workaround".

BTW, I just discovered that I started a WIP branch for this already

Nice! I'll take a look when I have some time.

s-ludwig · 2022-09-02T13:54:42Z

I have no objection to using postblit/copy vs. destroy+blit, so long as it's not a manual process.

But it is manual in both cases. I feel like I'm obviously missing something here, but unless I'm terribly mistaken, unions are not performing any lifetime functionality on their own. The differences AFAICS are just automatic alignment, automatic avoidance of GC scanning, as well as direct member access vs. a cast.

schveiguy · 2022-09-02T14:10:18Z

It has to be done via direct member access. But e.g. this kind of stuff is hard to comprehend why it's needed:

taggedalgebraic/source/taggedalgebraic/taggedunion.d

Lines 793 to 817 in c40e23c

    
           package void rawEmplace(T)(void[] dst, ref T src) 
        
           { 
        
           	T[] tdst = () @trusted { return cast(T[])dst[0 .. T.sizeof]; } (); 
        
           	static if (is(T == class)) { 
        
           		tdst[0] = src; 
        
           	} else { 
        
           		import std.conv : emplace; 
        
           		emplace!T(&tdst[0]); 
        
           		rawSwap(tdst[0], src); 
        
           	} 
        
           } 
        
           // std.algorithm.mutation.swap sometimes fails to compile due to 
        
           // internal errors in hasElaborateAssign!T/isAssignable!T. This is probably 
        
           // caused by cyclic dependencies. However, there is no reason to do these 
        
           // checks in this context, so we just directly move the raw memory. 
        
           package void rawSwap(T)(ref T a, ref T b) 
        
           @trusted { 
        
           	void[T.sizeof] tmp = void; 
        
           	void[] ab = (cast(void*)&a)[0 .. T.sizeof]; 
        
           	void[] bb = (cast(void*)&b)[0 .. T.sizeof]; 
        
           	tmp[] = ab[]; 
        
           	ab[] = bb[]; 
        
           	bb[] = tmp[]; 
        
           }

This is where I started trying to fix #63 and where it started complaining about @nogc no longer being valid.

Maybe I'm misunderstanding the complexity. The whole thing is somewhat confusing to me. Why in some cases it's a void[N], and in others it's a union between the first member and void[N]? This makes weird problems happen when your first member is of a certain type (but I suspect there are similar issues with SumType).

It would be nice if the storage is U fields; Kind tag; and that's it.

s-ludwig · 2022-09-02T14:17:37Z

Okay, but using m_data.trustedGet!T = ...; should be more or less exactly the same as m_data.field_0 = ...; in the union case. The rawEmplace was just a workaround for issues with non-copyable types and bogus complaints about internal pointers by swap (I don't remember the details right now). But generally, this could be replaced by a very similar approach to the the SumType code.

edit: CTFE is another valid point though!

schveiguy · 2022-09-02T15:11:57Z

Okay, but using m_data.trustedGet!T = ...; should be more or less exactly the same as m_data.field_0 = ...;

But that's not what happens, these are the paths for all construction and opAssign:

taggedalgebraic/source/taggedalgebraic/taggedunion.d

Line 74 in c40e23c

rawSwap(this, other);

taggedalgebraic/source/taggedalgebraic/taggedunion.d

Line 322 in c40e23c

m_data.rawEmplace(value);

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Change underlying storage to an actual union #72

Change underlying storage to an actual union #72

schveiguy commented Aug 29, 2022

s-ludwig commented Aug 29, 2022

schveiguy commented Aug 29, 2022

s-ludwig commented Aug 30, 2022

schveiguy commented Aug 30, 2022

s-ludwig commented Sep 2, 2022

schveiguy commented Sep 2, 2022

s-ludwig commented Sep 2, 2022

schveiguy commented Sep 2, 2022

s-ludwig commented Sep 2, 2022 •

edited

Loading

schveiguy commented Sep 2, 2022

Change underlying storage to an actual union #72

Change underlying storage to an actual union #72

Comments

schveiguy commented Aug 29, 2022

s-ludwig commented Aug 29, 2022

schveiguy commented Aug 29, 2022

s-ludwig commented Aug 30, 2022

schveiguy commented Aug 30, 2022

s-ludwig commented Sep 2, 2022

schveiguy commented Sep 2, 2022

s-ludwig commented Sep 2, 2022

schveiguy commented Sep 2, 2022

s-ludwig commented Sep 2, 2022 • edited Loading

schveiguy commented Sep 2, 2022

s-ludwig commented Sep 2, 2022 •

edited

Loading