-
Notifications
You must be signed in to change notification settings - Fork 253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve Renderer Performance #545
Conversation
I can confirm that this fixed whatever was wrong on my M3 MacBook Pro. |
1d07172
to
a693814
Compare
Use orphaned "buffer update streaming" to eliminate synchronization delays, due to CPU->GPU latency, causing framerate slowdowns. This sends vertex data to the GPU via partial updates to a buffer and reallocates the buffer once it fills up, ensuring that no synchronization is ever needed. The buffer is sized to balance memory usage vs allocation rate, and the GL driver ensures that "orphaned" buffers are only destroyed when all GL commands using them are retired.
a693814
to
be59b88
Compare
Fixed Windows build. |
I tested on Linux, and there no issues (mine framerate is capped by 60 FPS by default). I'd request additional testings on Windows though. |
On my windows laptop (no dedicated GPU), FPS goes from ~60 to ~20 on the Bluedevil stage with this commit (debug build) |
What GPU does the laptop have? |
Thinkpad T480s, probably some Intel integrated chip. FPS drop is less dramatic on other campaign missions but still noticeable (60->50). I also get some screen tearing that I didnt have before the renderer rework |
Tearing is interesting. I can't think of anything in this merge that would directly affect that, but maybe the synchronization overhead caused some problems with vsync/frame atomicity at the driver level. |
Tearing already happened after #527, not new here though |
oh wait wow i thought this comment was on the already-committed PR. this one makes your perf worse? what was your laptop's FPS on 1.5? |
Speculating: Integrated GPU likely means shared VRAM. I wonder if that alters the effectiveness of the buffer streaming approach.
Is that right? I would hope that there would also be a significant drop between 1 and 2, otherwise I'd be pretty confused tbh. My first thought is maybe to play around with the allocated buffer size - if it's ignoring the |
yes, something like that. FPS in 1. was about 60, in 2. much less consistent, and 3. even lower. I'll do more testing and proper benchmark, it's important that we support lower-end devices correctly |
Definitely important, and definitely concerning. This will potentially be a difficult case to test/fix because I don't have any low-end machinery >_< If you could profile your performance or run it with Renderdoc or another introspection tool, that may give some clues. |
On the same device, both Linux and Windows release build gets 60FPS pretty consistently, so it's only about the Windows debug build. We've had slowdowns on Win debug before, IMO it's acceptable to ignore for the time being. |
Aha! Great find. This PR makes use of iterators instead of bare pointers, and MSVC's Thanks for merging! |
MSVC uses STL iterators debugging (https://learn.microsoft.com/en-us/cpp/standard-library/debug-iterator-support) which greatly reduces performance on iterating STL containers like std::array in Debug mode. |
I was not aware of that! Is there any way we could kindly ask MSVC to disable iterators debugging for this iterator in particular? |
|
#550 should hopefully mitigate this. @winterheart's suggestion of |
Pull Request Type
Description
Use orphaned "buffer update streaming" to eliminate synchronization delays, due to CPU->GPU latency, causing framerate slowdowns. This sends vertex data to the GPU via partial updates to a buffer and reallocates the buffer once it fills up, ensuring that no synchronization is ever needed. The buffer is sized to balance memory usage vs allocation rate, and the GL driver ensures that "orphaned" buffers are only destroyed when all GL commands using them are retired.
Related Issues
Fixes #543
Screenshots (if applicable)
Checklist