Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implementation of WaitUntil.CoroutineEnds #821

Closed

Conversation

SirePi
Copy link
Member

@SirePi SirePi commented May 19, 2020

This is an implementation of WaitUntil.CoroutineEnds as described here.

@SirePi SirePi requested a review from Barsonax May 19, 2020 17:32
Source/Core/Duality/Utility/Coroutines/WaitUntil.cs Outdated Show resolved Hide resolved
Source/Core/Duality/Utility/Coroutines/WaitUntil.cs Outdated Show resolved Hide resolved
Source/Core/Duality/Utility/Coroutines/WaitUntil.cs Outdated Show resolved Hide resolved
Source/Core/Duality/Utility/Coroutines/WaitUntil.cs Outdated Show resolved Hide resolved
@Barsonax
Copy link
Member

Scanned through the changes a bit. Looks pretty good already but I think we can generalize it a bit more so we can use this for more than just a coroutineends feature.

Also I want to measure the performance of this again.

@Barsonax
Copy link
Member

Barsonax commented May 19, 2020

As expected some allocations are happening. Seems to be about 144 bytes per call which does impact performance quite a bit.

// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.836 (1909/November2018Update/19H2)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
  [Host]     : .NET Framework 4.8 (4.8.4180.0), X64 RyuJIT
  DefaultJob : .NET Framework 4.8 (4.8.4180.0), X64 RyuJIT


|                Method |      N |             Mean |          Error |         StdDev |     Gen 0 |     Gen 1 |    Gen 2 |  Allocated |
|---------------------- |------- |-----------------:|---------------:|---------------:|----------:|----------:|---------:|-----------:|
| CallEndlessSubRoutine |      1 |         72.61 ns |       0.835 ns |       0.781 ns |    0.0869 |         - |        - |      144 B |
| CallEndlessSubRoutine |     10 |        672.92 ns |       3.946 ns |       3.498 ns |    0.8698 |         - |        - |     1444 B |
| CallEndlessSubRoutine |    100 |      6,785.61 ns |      32.611 ns |      27.232 ns |    8.6975 |         - |        - |    14443 B |
| CallEndlessSubRoutine |   1000 |     78,150.10 ns |     691.155 ns |     646.507 ns |   56.3965 |   14.0381 |        - |   144427 B |
| CallEndlessSubRoutine |  10000 |  1,027,735.81 ns |  11,053.970 ns |  10,339.890 ns |  230.4688 |  115.2344 |        - |  1444252 B |
| CallEndlessSubRoutine | 100000 | 29,019,037.50 ns | 248,125.998 ns | 232,097.212 ns | 2625.0000 | 1093.7500 | 343.7500 | 14442578 B |

Its important to note that this only happens when you use the delegate. There is also a performance hit without because the struct is now bigger but its much smaller.

@Barsonax
Copy link
Member

Barsonax commented May 22, 2020

Not yet satisfied with the performance as there is alot of allocations going on (closures, boxing) but I don't think there's a way around this unless C# adds something like a yield all or yield foreach keyword.

In that regard writing this does exactly the same:

foreach(var step in SubCoroutine()) {
  yield return step;
}

This should prevent most of the allocations (there is still a enumerator being allocated) and makes the WaitUntil struct smaller.

It is however more verbose and less readable.

To give some numbers to show how big this difference is:

// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.836 (1909/November2018Update/19H2)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
  [Host]     : .NET Framework 4.8 (4.8.4180.0), X64 RyuJIT
  DefaultJob : .NET Framework 4.8 (4.8.4180.0), X64 RyuJIT


|                              Method |   N |        Mean |     Error |    StdDev |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|------------------------------------ |---- |------------:|----------:|----------:|-------:|------:|------:|----------:|
| CallEndlessSubRoutine_CoroutineEnds |   1 |    70.04 ns |  0.698 ns |  0.653 ns | 0.0869 |     - |     - |     144 B |
|       CallEndlessSubRoutine_Foreach |   1 |    51.59 ns |  0.418 ns |  0.391 ns | 0.0241 |     - |     - |      40 B |
| CallEndlessSubRoutine_CoroutineEnds | 100 | 6,829.83 ns | 73.671 ns | 68.912 ns | 8.6975 |     - |     - |   14443 B |
|       CallEndlessSubRoutine_Foreach | 100 | 4,788.00 ns | 35.413 ns | 33.125 ns | 2.4109 |     - |     - |    4012 B |

Thats already a big difference, especially in allocations.

But thats not all because in both benchmarks the WaitUntil struct is bigger due to the added field for the delegate. If we remove it the numbers change into this:

// * Summary *

BenchmarkDotNet=v0.12.1, OS=Windows 10.0.18363.836 (1909/November2018Update/19H2)
AMD Ryzen 9 3900X, 1 CPU, 24 logical and 12 physical cores
  [Host]     : .NET Framework 4.8 (4.8.4180.0), X64 RyuJIT
  DefaultJob : .NET Framework 4.8 (4.8.4180.0), X64 RyuJIT


|                        Method |   N |        Mean |     Error |    StdDev |  Gen 0 | Gen 1 | Gen 2 | Allocated |
|------------------------------ |---- |------------:|----------:|----------:|-------:|------:|------:|----------:|
| CallEndlessSubRoutine_Foreach |   1 |    43.01 ns |  0.702 ns |  0.657 ns | 0.0193 |     - |     - |      32 B |
| CallEndlessSubRoutine_Foreach | 100 | 3,846.48 ns | 38.659 ns | 34.270 ns | 1.9302 |     - |     - |    3210 B |

So this is gonna be a question what we will value more. The readability of CoroutineEnds or the performance of just foreaching over the IEnumerable.

With deeper nesting the differences will only get bigger so really doubting if we should add this utility method.

@SirePi
Copy link
Member Author

SirePi commented May 22, 2020

If this is how it is, then I believe it's better to just remove the method altogether.. there are many ways to deal with it in case.
You can take the foreach route, or you can go with the full coroutine-declaring approach that even lets you pause or resume it.

I'll just close it and add a note in the docs that this there is the possibility to integrate a Coroutine inside another

@SirePi SirePi closed this May 22, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

2 participants