
Issue merging a LoRA model into a SANA transformer #2318

frutiemax92 opened this issue Jan 10, 2025 · 6 comments
Comments


frutiemax92 commented Jan 10, 2025

System Info

peft=0.14.0

Who can help?

@BenjaminBossan @sayakpaul

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder
  • My own task or dataset (give details below)

Reproduction

from diffusers import SanaPipeline, SanaPAGPipeline, SanaTransformer2DModel
from peft import PeftModel

# Load the fine-tuned SANA transformer and the attached rank-4 LoRA adapter
# (the '0' directory extracted from 0.zip below)
transformer = SanaTransformer2DModel.from_pretrained("frutiemax/twistedreality-sana-1600m-1024px")
print(transformer)
peft_model = PeftModel.from_pretrained(transformer, '0')
model = peft_model.merge_and_unload()  # raises a shape mismatch error

Expected behavior

I've trained a LoRA model with PEFT on a SANA checkpoint. I can train and run inference with the PEFT model. However, when I try to merge the LoRA into the base checkpoint, I encounter a shape mismatch. I've attached the LoRA model with rank 4.

[Screenshot of the shape-mismatch traceback]

Attachment: 0.zip (the rank-4 LoRA adapter)

@sayakpaul (Member) commented:

Please make the code fully reproducible.

@frutiemax92 (Author) commented:

Updated the code with missing imports.

@BenjaminBossan (Member) commented:

Could you please show how you initialized the model (training code not necessary) and how you saved it? Also, do you know whether frutiemax/twistedreality-sana-1600m-1024px corresponds to the official Sana model? What is different?

@frutiemax92 (Author) commented:

I've investigated this a bit more and found that only the target module 'conv_depth' makes it crash; PEFT with the other modules works fine.

from diffusers import SanaTransformer2DModel
from peft import LoraConfig, get_peft_model

# Load the official Sana transformer and attach a rank-4 LoRA on conv_depth only
transformer = SanaTransformer2DModel.from_pretrained("Efficient-Large-Model/Sana_1600M_1024px_diffusers", subfolder='transformer')
lora_config = LoraConfig(r=4, target_modules=['conv_depth'], lora_alpha=4)
model = get_peft_model(transformer, lora_config)
model = model.merge_and_unload()  # the shape mismatch is raised here
model.save_pretrained('merged_model')

@frutiemax92 (Author) commented:

I also want to add that calling the forward function with LoKr and LoHa also crashes on the conv_depth module, but not with the regular LoRA algorithm.
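
For reference, a minimal sketch of how the LoKr/LoHa variants might be set up; only the conv_depth target and rank come from the experiments above, the alpha value is a placeholder:

from diffusers import SanaTransformer2DModel
from peft import LoHaConfig, LoKrConfig, get_peft_model

transformer = SanaTransformer2DModel.from_pretrained(
    "Efficient-Large-Model/Sana_1600M_1024px_diffusers", subfolder='transformer')

# Swap LoHaConfig for LoKrConfig to test the other variant
config = LoHaConfig(r=4, alpha=4, target_modules=['conv_depth'])
model = get_peft_model(transformer, config)
# Running a forward pass through this model (e.g. via the Sana pipeline)
# then crashes on the conv_depth module, as described above.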

@BenjaminBossan (Member) commented:

Thanks for the snippet. The LoRA adapter does not have the right shape because we're not honoring the groups argument of this Conv2d layer. Usually it's 1, so it doesn't matter, but here it's 11200. This issue was already reported in #2153 but is still awaiting a PR.

> I also want to add that calling the forward function with LoKr and LoHa also crashes on the conv_depth module, but not with the regular LoRA algorithm.

Most likely, the same issue is the cause here. I'm not sure why forward works for LoRA but I would not rely on the result being correct.
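
To illustrate the shape problem, here is a minimal, hypothetical sketch (not PEFT's actual merge code): for a grouped nn.Conv2d the weight has shape (out_channels, in_channels // groups, kH, kW), so a LoRA delta computed as if groups were 1 cannot be added onto the base weight. Small channel counts are used here instead of Sana's 11200.

import torch
import torch.nn as nn

in_channels = out_channels = groups = 8  # stand-in for conv_depth's 11200
conv = nn.Conv2d(in_channels, out_channels, kernel_size=3, groups=groups, padding=1)
print(conv.weight.shape)  # torch.Size([8, 1, 3, 3])

# A rank-4 delta built without honoring groups has shape (out, in, 3, 3) ...
r = 4
lora_A = torch.randn(r, in_channels * 3 * 3)
lora_B = torch.randn(out_channels, r)
delta = (lora_B @ lora_A).view(out_channels, in_channels, 3, 3)
print(delta.shape)  # torch.Size([8, 8, 3, 3])

# ... so merging fails because (8, 8, 3, 3) cannot be added in place to (8, 1, 3, 3)
try:
    conv.weight.data += delta
except RuntimeError as e:
    print(e)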
