Thank you for the excellent work. I have two questions:

1. My understanding of distillation is that one inference step of the student should be comparable to n inference steps of the teacher. However, in both `compute_distribution_matching_loss` and `compute_loss_fake`, the `true_unet` and `fake_unet` perform only a single denoising step (for instance, if the generator runs one step at t = 399, the `true_unet` and `fake_unet` also run one step at a random timestep, say t = 540). Since the original `true_unet` also performs poorly with one-step denoising, why does this loss work? (See the sketch below for how I currently read the computation.)

2. I am trying to use DMD2 to distill a one-step SD cascade. Is it enough to use only `compute_distribution_matching_loss` and `compute_loss_fake`?
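To make question 1 concrete, here is a rough sketch of how I currently understand the distribution-matching loss. The `add_noise` helper and the `(noisy, t)` call signature are simplifications of mine, not the actual repo API, and I have left out conditioning and loss weighting:

```python
import torch
import torch.nn.functional as F

def distribution_matching_loss_sketch(generated_image, true_unet, fake_unet,
                                      add_noise, min_t=20, max_t=980):
    """Sketch of my reading of compute_distribution_matching_loss.

    `true_unet` / `fake_unet` are callables (noisy_image, t) -> predicted clean
    image, and `add_noise` stands in for the scheduler's forward-diffusion step.
    """
    with torch.no_grad():
        # One random intermediate timestep per sample (e.g. ~540 in my example).
        t = torch.randint(min_t, max_t, (generated_image.shape[0],),
                          device=generated_image.device)
        noise = torch.randn_like(generated_image)
        noisy = add_noise(generated_image, noise, t)

        # Both score networks perform only ONE denoising step at timestep t.
        pred_real = true_unet(noisy, t)   # frozen teacher score estimate
        pred_fake = fake_unet(noisy, t)   # trainable "fake" score estimate

        # DMD gradient direction: difference of the two score estimates.
        grad = torch.nan_to_num(pred_fake - pred_real)

    # Surrogate MSE whose gradient w.r.t. generated_image is `grad`
    # (up to normalization), so only the generator receives the update.
    return 0.5 * F.mse_loss(generated_image,
                            (generated_image - grad).detach())
```

My confusion is that each of these one-step predictions is individually inaccurate, yet their difference apparently still gives a useful gradient for the generator.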