Performance optimization opportunities in common pixel formats. #2232
Comments
ImageSharp is .NET 6+ now, correct? If you want to assign this to me, I can work on getting a fix up and take a peek at some of the other SIMD code for other .NET 6+ specific improvements.
@tannergooding yes it's .NET 6+ now. I have assigned you to this task, thanks!
@tannergooding Any and all suggestions will be deeply appreciated 😄
Few questions...
For 1, I'm mostly asking as I can try and do several "smallish" PRs each targeting a specific type or I can do large PRs that try and cover changes more holistically across the repo.
For 2, mostly asking what the preferred way of handling cases such as where
For 3, naturally I'd like to show numbers actually indicating changes are wins and in a format expected by the repo. I didn't see any callouts in the contributing docs, but maybe I missed it.
For 4, as an example there are a number of structs that contain 2/3/4 float fields (often exposed via auto-props). Due to the JIT specializing Vector2/3/4 but not user-defined structs today (hopefully I can get this fixed eventually, so all qualifying types get this same specialization), you will likely get better codegen by wrapping a single
Additionally there is the potential to wrap
There's also a slightly more complex consideration in that while
Also some cases where
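To make point 4 concrete, here is a minimal sketch, assuming the wrapped field is a `Vector4`. The type name `Float4Sketch` and its members are hypothetical and are not taken from ImageSharp; the point is only that the struct holds one `Vector4` instead of four `float` auto-props, so arithmetic can go through the JIT-specialized vector type.

```csharp
using System.Numerics;

// Hypothetical type for illustration only; not an actual ImageSharp struct.
public struct Float4Sketch
{
    // One Vector4 backing field instead of four separate float fields.
    private Vector4 data;

    public Float4Sketch(float x, float y, float z, float w) => this.data = new Vector4(x, y, z, w);

    private Float4Sketch(Vector4 data) => this.data = data;

    // Per-component access is still exposed through ordinary properties.
    public float X { get => this.data.X; set => this.data.X = value; }

    public float Y { get => this.data.Y; set => this.data.Y = value; }

    public float Z { get => this.data.Z; set => this.data.Z = value; }

    public float W { get => this.data.W; set => this.data.W = value; }

    // Arithmetic forwards to Vector4, so all four lanes are handled together.
    public static Float4Sketch operator +(Float4Sketch left, Float4Sketch right)
        => new Float4Sketch(left.data + right.data);
}
```

Whether this actually wins for a given type would still need to be confirmed with benchmarks, as question 3 suggests.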
Apologies for the slow reply, was offline this weekend. I'll try to summarise a response to each question for you.
Thanks again for your help here!
Thanks a lot for the responses, they all make sense to me!
Just wanted to give an update on this. I've been familiarizing myself with the codebase and the various SIMD code, making notes of possible improvements as I go. I'll be closing on my first house and moving over the next couple weeks so will end up taking a short break, but hope to have a PR up and a bullet list of potential additional improvements closer to the end of the month.
Congratulations on the house! No worries, enjoy your break. Looking forward to seeing what you come up with!
Am still working on this. Codegen wasn't doing what I expected so I spent some time fixing that up in the JIT (believe you've seen the PR already 😄). With the changes I've made, the codegen on my local changes is significantly better. So should provide some nice wins off the bat and even better gains once 8.0 comes around.
Yeah, I saw that PR 😃 awesome stuff! .NET 8 seems like it's going to allow massive improvements to the ImageSharp codebase. Simplified SIMD with out-of-the-box ARM support will be a game changer.
Also got a rewrite of Matrix4x4 and Matrix3x2 in: dotnet/runtime#80091. This resulted in perf improvements of 2x up to 48x and should have a huge positive impact on ImageSharp. There is also opportunity for me to rewrite/improve Plane and Quaternion, but I didn't see any broad impact in my initial profile captures. I plan on rerunning ImageSharp perf benchmarks once I have a
Haha.... You're a whole new level of awesome. I was expecting a few small changes; you're rewriting the runtime for the benefit of all. Truly amazing!
I think our ColorMatrix type can benefit from copying those changes.
Prerequisites
DEBUG and RELEASE mode
ImageSharp version
v3 alpha +
Other ImageSharp packages and versions
NA
Environment (Operating system, version and so on)
NA
.NET Framework version
NA
Description
As described here, there are several performance opportunities that can be implemented in many of our pixel format types. This should be fairly low-hanging fruit with a good return.
Notably on .NET 6/7, you could make this even more efficient by doing something like:
This converts all 4 elements at once and then extracts the truncated bytes directly:
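A minimal sketch of the conversion described above, assuming a normalized `Vector4` input and a hypothetical `PackSketch.Pack` helper (neither the helper name nor its tuple return type comes from ImageSharp). It uses the SSE/SSE2 intrinsics available on .NET 6 with a scalar fallback, and builds its constants with `Vector128.Create` inside the method rather than in static fields.

```csharp
using System;
using System.Numerics;
using System.Runtime.Intrinsics;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper for illustration; not ImageSharp's actual Rgba32 code.
public static class PackSketch
{
    // Scales a normalized (0..1) Vector4 to bytes and returns (R, G, B, A).
    public static (byte R, byte G, byte B, byte A) Pack(Vector4 vector)
    {
        if (Sse2.IsSupported)
        {
            // Method-local constants: no static field or static initializer involved.
            Vector128<float> max = Vector128.Create(255f);
            Vector128<float> half = Vector128.Create(0.5f);

            // Scale, round by adding 0.5, then clamp to 0..255.
            Vector128<float> v = vector.AsVector128();
            v = Sse.Add(Sse.Multiply(v, max), half);
            v = Sse.Min(Sse.Max(v, Vector128<float>.Zero), max);

            // Truncate all four floats to int32 at once, then view the result as bytes.
            // Each value is within 0..255, so its low byte sits at index 0, 4, 8 and 12.
            Vector128<byte> bytes = Sse2.ConvertToVector128Int32WithTruncation(v).AsByte();
            return (bytes.GetElement(0), bytes.GetElement(4), bytes.GetElement(8), bytes.GetElement(12));
        }

        // Scalar fallback for hardware without SSE2.
        vector = Vector4.Clamp((vector * 255f) + new Vector4(0.5f), Vector4.Zero, new Vector4(255f));
        return ((byte)vector.X, (byte)vector.Y, (byte)vector.Z, (byte)vector.W);
    }
}
```

For example, `PackSketch.Pack(new Vector4(0f, 1f, 0.5f, 2f))` yields `(0, 255, 128, 255)` on either path. The exact instructions the JIT emits for the `GetElement` calls depend on the runtime version.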
`vpextrb` more in the future so it can be just `vpextrb [rcx+2], xmm0, 0` instead of `vpextrb eax, xmm0, 0` followed by `mov [rcx+2], al`.
You can also optimize in .NET 6+ by directly using `Vector128.Create()`. This creates a method-local constant and avoids the static initializer entirely:
Steps to Reproduce
NA
Images
No response