the way I understand it, Disco supports multi-CLIP by picking a different perceptor from its active pool each step, so instead of throwing them all at the step, you just throw one. or maybe they do a gradient update at 1/3 strength for each perceptor?
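A rough sketch of the two strategies, assuming a hypothetical `perceptors` list of CLIP-like models each exposing a `loss(image, prompt)` method (neither name is from Disco's actual code):

```python
import random

import torch

def step_one_perceptor(image, prompt, perceptors, opt):
    """Strategy 1: each step uses a single randomly chosen perceptor."""
    opt.zero_grad()
    perceptor = random.choice(perceptors)
    loss = perceptor.loss(image, prompt)
    loss.backward()
    opt.step()

def step_scaled_all(image, prompt, perceptors, opt):
    """Strategy 2: accumulate a 1/N-weighted gradient from every perceptor."""
    opt.zero_grad()
    for perceptor in perceptors:
        # scale each loss by 1/N so the combined step has comparable magnitude
        (perceptor.loss(image, prompt) / len(perceptors)).backward()
    opt.step()
```

Strategy 1 is cheaper per step (one forward/backward) but noisier; strategy 2 is smoother but costs N passes per step.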
anyway, either of those approaches -- or both -- could be used to incorporate multiple VQGANs as well, or depth models, or whatever.
I wonder if I could alternate stepping a diffusion denoiser with a prediction from a VQGAN guided in parallel via the same prompt, etc. I guess the issue here is the reduced diversity, huh — which is sorta the issue with VQGAN in general. goddamn it, I really need to add diffusion.
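The alternation idea, purely as a scheduling sketch — both step functions below are placeholders standing in for real model calls (not actual APIs); only the even/odd interleaving is the point:

```python
# Placeholder steps: in a real setup these would be one reverse-diffusion
# update and one CLIP-guided VQGAN update on the same prompt, respectively.
def diffusion_denoise_step(x: float, t: int) -> float:
    return x * 0.9  # stand-in for one denoising step

def vqgan_guided_step(x: float) -> float:
    return x - 0.01  # stand-in for one guided VQGAN step

def alternate(x: float, n_steps: int) -> float:
    # Even steps denoise, odd steps take the VQGAN-guided update.
    for t in range(n_steps):
        x = diffusion_denoise_step(x, t) if t % 2 == 0 else vqgan_guided_step(x)
    return x
```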
but yeah. alternative version of multi-CLIP, then experiment with multi-VQGAN. (aww shit, maybe instead of keeping it strictly at 50:50 influence, I could add a parameter for each pixel and fit that with the image, starting with a really tiny learning rate, as a learned lerping weight for that pixel (i.e. learned model responsibilities).)
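A minimal sketch of that per-pixel lerp, assuming a made-up `alpha_logits` parameter (one scalar per pixel) squashed through a sigmoid so the blend weight stays in (0, 1) and initializes at the 50:50 point:

```python
import torch

def blend(img_a, img_b, alpha_logits):
    """Lerp two model outputs with a learned per-pixel responsibility map."""
    w = torch.sigmoid(alpha_logits)  # in (0, 1); exactly 0.5 when logits are 0
    return w * img_a + (1.0 - w) * img_b

# alpha_logits would get its own optimizer group with a tiny lr, e.g.
# (illustrative values only):
# opt = torch.optim.Adam([
#     {"params": [latent], "lr": 0.05},
#     {"params": [alpha_logits], "lr": 1e-4},  # learned responsibilities
# ])
```

Initializing the logits at zero recovers the strict 50:50 blend, so the learned version can only drift away from it as slowly as the tiny learning rate allows.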