AMD is way behind #807
Replies: 3 comments 6 replies
-
It's not that CUDA is faster per se; it's far more complete and stable. All the "extra" code, like cross-attention memory optimizations (xformers or SDP), works perfectly on CUDA but not on ROCm, and that's where the big performance and memory-savings boosts come from.
-
First off, I have a 6700 XT with 12GB of VRAM, so I do have a slightly better card. That said, I can generate MUCH higher-res images than that, and I'm getting around 7.6 it/s at 512x512 using Euler a, so either my card is significantly faster than yours or you've got an issue with your settings/setup. Can you post screenshots of your settings? I can make some recommendations, specifically for the Stable Diffusion and compute settings screens.
-
I'll list my weird performance-quirk findings here (fresh install, settings above from iDeNoh):
token merging -> either does not work or does not impact generation speed at all
hi-res fix (latent) -> abysmal performance here
cross attention -> SDP, InvokeAI and Doggettx on the same level
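For anyone wondering what the "SDP" cross-attention option actually does: it routes attention through PyTorch's fused `scaled_dot_product_attention`, which computes the same result as naive attention but can use Flash/memory-efficient backends that avoid materializing the full score matrix. A minimal sketch (shapes are illustrative, not SD's real dimensions):

```python
# Sketch: naive attention vs PyTorch's fused SDP attention.
# Illustrative shapes only; SD's UNet uses different sizes.
import torch
import torch.nn.functional as F

q = torch.randn(1, 8, 64, 32)  # (batch, heads, tokens, head_dim)
k = torch.randn(1, 8, 64, 32)
v = torch.randn(1, 8, 64, 32)

# Naive attention: materializes the full (tokens x tokens) score matrix,
# which is what eats VRAM at high resolutions.
scores = q @ k.transpose(-2, -1) / (q.shape[-1] ** 0.5)
naive = scores.softmax(dim=-1) @ v

# Fused SDP: same math, but the backend may never materialize the
# score matrix, saving memory. On ROCm, backend support is spottier,
# which matches the performance differences reported above.
fused = F.scaled_dot_product_attention(q, k, v)

print(torch.allclose(naive, fused, atol=1e-5))
```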
-
Since SD was released I was always kind of content with the performance of my 8GB RX 6650 XT on Ubuntu.
I just installed my old 6GB GeForce GTX 1060 in my second slot to use another program that has no AMD support yet.
Now, I don't know if something in my AMD setup was always messed up, but I am absolutely blown away by the VRAM management of the Nvidia card.
I can do highres fix up to 1400x788 with a lot of VRAM to spare, while the AMD card runs out of memory at around 1100x620.
Speed of course is a bit slower on the 1060 (1.2 it/s vs 3.2 it/s at 512x512), but is ROCm really THAT MUCH worse than CUDA?
SD on AMD is basically constantly walking on eggshells to avoid running out of VRAM, while the 6-year-old Nvidia GPU is just stable and solid.
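If you want to see the headroom directly instead of guessing by trial and error, PyTorch exposes free/total device memory through the same `torch.cuda` namespace on both CUDA and ROCm builds. A hedged sketch (the helper name is mine, not part of any SD UI):

```python
# Sketch: check free/total VRAM before attempting a hi-res pass.
# torch.cuda.mem_get_info works on both CUDA and ROCm builds of
# PyTorch; on a CPU-only machine this returns None.
import torch

def vram_headroom_gb():
    """Return (free_gb, total_gb) for the current device, or None."""
    if not torch.cuda.is_available():
        return None
    free, total = torch.cuda.mem_get_info()  # values in bytes
    return free / 2**30, total / 2**30

print(vram_headroom_gb())
```

Watching the free number drop as you raise the hi-res target makes the "walking on eggshells" effect very visible, and helps compare how much scratch memory each cross-attention backend actually needs.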