
feat: refine loRA diffusers to flux conversion logic #7708

Open · simpletrontdip wants to merge 9 commits into main

Conversation

simpletrontdip

Summary

This PR updates the LoRA Diffusers -> Flux conversion logic based on its original source.

Related Issues / Discussions

I couldn't load Hyper-FLUX.1-dev-Nsteps-lora.safetensors from https://huggingface.co/ByteDance/Hyper-SD via the Invoke UI.

QA Instructions

  1. Install Hyper-FLUX 8 steps: type ByteDance/Hyper-SD in the search box.
  2. Load and run it with a Flux.dev flow (to see it fail).
  3. Apply the changes
  4. Check the output
  [Screenshot: 2025-02-28 at 15:01:56]

Merge Plan

Apply the change only; it's a small one.

Checklist

  • The PR has a short but descriptive title, suitable for a changelog
  • Tests added / updated (if applicable)
  • Documentation added / updated (if applicable)
  • Updated What's New copy (if doing a release after this PR)

@github-actions bot added the python (PRs that change python files) and backend (PRs that change backend files) labels on Feb 28, 2025
@RyanJDick (Collaborator) left a comment:

Thanks for digging into this!

To get it merged, we'll need to:

  1. Fix the shift/scale transformation.
  2. Add a unit test in test_flux_diffusers_lora_conversion_utils.py for this new LoRA format. See the other tests in that file for reference.

Comment on lines 90 to 94
for _key in values.keys():
    # The original SD3 implementation of AdaLayerNormContinuous splits the linear
    # projection output into (shift, scale), while diffusers splits it into
    # (scale, shift). Swap the halves of the linear projection weights so the
    # diffusers implementation can be used.
    scale, shift = values[_key].chunk(2, dim=0)
    values[_key] = torch.cat([shift, scale], dim=0)
@RyanJDick (Collaborator) commented:

This doesn't look right to me. If I'm understanding correctly, in the case of a vanilla LoRA layer, we should only be flipping one of the LoRA components.

The required transformation would be a bit more involved for other LoRA variants (LoHA, LoKR, etc.), so I'm fine with only supporting vanilla LoRAs. But, we should assert that the result of any_lora_layer_from_state_dict() is a LoRALayer.
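
For context, a minimal editorial sketch (not code from the PR) of why only one LoRA component needs flipping: the shift/scale swap is just a row permutation of the output dimension, so it distributes over W + up @ down and can equivalently be applied to the up matrix alone.

import torch

def swap_shift_scale(t: torch.Tensor) -> torch.Tensor:
    # Swap the two halves along dim 0 (shift <-> scale).
    a, b = t.chunk(2, dim=0)
    return torch.cat([b, a], dim=0)

# Toy shapes; out_features must be even so it splits into shift/scale halves.
out_features, in_features, rank = 8, 6, 2
W = torch.randn(out_features, in_features)   # base AdaLN linear weight
up = torch.randn(out_features, rank)         # LoRA "up" matrix
down = torch.randn(rank, in_features)        # LoRA "down" matrix

# Swapping the fully merged weight ...
swapped_merged = swap_shift_scale(W + up @ down)

# ... equals swapping the base weight and only the up matrix, because the swap
# is a row permutation of the output dimension.
swapped_parts = swap_shift_scale(W) + swap_shift_scale(up) @ down

assert torch.allclose(swapped_merged, swapped_parts)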

@simpletrontdip (Author) commented on Mar 7, 2025:

Hi @RyanJDick, thanks for spending time on this 👯
I have to confess it is more complex than I expected; sorry for not asking the team beforehand.

As I understand it:

# For a normal LoRA layer:
delta_W = up @ down
W = W + delta_W

# For AdaLN in diffusers, both the base weight and the LoRA delta need the same swap:
W_prime = swap_shift_scale(W)
delta_W_prime = swap_shift_scale(delta_W)

# => We may need to add a custom LoRA layer that applies the swap in `get_weight`:

class AdaLN_LoRALayer(LoRALayer):
    def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
        '''Swap shift and scale before returning the real weight.'''
        weight = super().get_weight(orig_weight)
        scale, shift = weight.chunk(2, dim=0)
        return torch.cat([shift, scale], dim=0)

# We would need to build and return this layer in our conversion function.

What do you think?
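
For what it's worth, a self-contained sketch of this idea using a minimal stand-in class (not InvokeAI's actual LoRALayer), checking that the overridden get_weight returns the swapped delta:

import torch

class MiniLoRALayer:
    '''Minimal stand-in for a vanilla LoRA layer (not InvokeAI's LoRALayer).'''
    def __init__(self, up: torch.Tensor, down: torch.Tensor):
        self.up, self.down = up, down

    def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
        # Plain LoRA delta.
        return self.up @ self.down

class AdaLNMiniLoRALayer(MiniLoRALayer):
    def get_weight(self, orig_weight: torch.Tensor) -> torch.Tensor:
        # Swap the shift/scale halves of the delta, as proposed above.
        weight = super().get_weight(orig_weight)
        scale, shift = weight.chunk(2, dim=0)
        return torch.cat([shift, scale], dim=0)

up, down = torch.randn(8, 2), torch.randn(2, 6)
orig_weight = torch.randn(8, 6)
layer = AdaLNMiniLoRALayer(up, down)

# The returned delta is the vanilla delta with its shift/scale halves swapped.
scale, shift = (up @ down).chunk(2, dim=0)
expected = torch.cat([shift, scale], dim=0)
assert torch.allclose(layer.get_weight(orig_weight), expected)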

@github-actions bot added the python-tests (PRs that change python tests) label on Mar 7, 2025