
Handle weight aliases #490

Closed

@jrruethe commented Jan 21, 2025

This is a quick modification to the extract_lora.py script to use the aliases from the weight info loaded from the architecture JSON.

It is my attempt at a solution to #484 and provides an alternative to the hardcoding found in #483.
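
The gist of the change, as a rough sketch (illustrative only; the helper and field names below are not the actual extract_lora.py code): when the canonical tensor name from the architecture definition is missing from the checkpoint, fall back to the aliases listed in the weight info and use whichever name the checkpoint actually contains.

```python
# Illustrative sketch, not the actual extract_lora.py code: try the canonical
# tensor name first, then each alias, and use whichever the checkpoint has.
def resolve_weight_name(weight_info, checkpoint_keys):
    """Return the first candidate name present in the checkpoint, else None."""
    candidates = [weight_info.name, *(weight_info.aliases or ())]
    for name in candidates:
        if name in checkpoint_keys:
            return name
    return None

# Example: an architecture entry named "lm_head.weight" with an alias of
# "model.embed_tokens.weight" resolves to the alias when the checkpoint only
# stores the embedding matrix (the Skywork case below).
```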

I have tested this on the following:

  • Qwen_Qwen2.5-0.5B
  • Qwen_Qwen2.5-1.5B
  • Qwen_Qwen2.5-3B
  • meta-llama_Llama-3.2-1B
  • meta-llama_Llama-3.2-3B
  • Skywork/Skywork-Reward-Llama-3.1-8B-v0.2 (This 8B model used "model.embed_tokens.weight" instead of "lm_head.weight", and this change accounts for the difference via the alias)

Tests:

  • Extraction works, without any "lm_head" errors.
  • Reapplying the extracted lora works.

Once the above tests are complete, I will take this PR out of "draft".

Notes:
I still get messages stating the following:

Some weights of Qwen2ForCausalLM were not initialized from the model checkpoint at cognitivecomputations_Dolphin3.0-Qwen2.5-0.5B and are newly initialized: ['embed_tokens.weight', 'layers.0.input_layernorm.weight', 'layers.0.mlp.down_proj.weight' ...

However, after looking into it some more, I believe this message is not an issue. I think it is coming from this line, since it essentially creates an empty model, but I'm not 100% sure.
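
For reference, transformers prints that notice whenever from_pretrained cannot match some of the class's parameters against the checkpoint it reads; those parameters are left randomly initialized, which is harmless as long as the real tensors are copied in afterwards. A minimal sketch of what "creating an empty model" means (not mergekit code; the model id is only a placeholder):

```python
# Minimal sketch (not mergekit code) of "creating an empty model": building a
# model from its config alone allocates parameters with random initial values
# and reads no checkpoint at all. The model id below is only a placeholder.
from transformers import AutoConfig, AutoModelForCausalLM

config = AutoConfig.from_pretrained("Qwen/Qwen2.5-0.5B")
empty = AutoModelForCausalLM.from_config(config)  # random weights, no tensors loaded
```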

Interestingly, I still receive an error when extracting the lora from this one, with the specific message "AttributeError: 'LlamaModel' object has no attribute 'lm_head'", which I find a bit bizarre. I got this message before my change too, so it is unrelated, but I am a bit surprised this change didn't fix it.
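
That particular error means the loaded object is the bare LlamaModel (the decoder stack without a language-model head), which has no lm_head attribute at all, whereas LlamaForCausalLM does; aliases cannot help when the attribute simply does not exist on the class. A small, self-contained illustration (the tiny random config here is assumed only for demonstration):

```python
# Small illustration: LlamaModel is the bare decoder and has no lm_head, while
# LlamaForCausalLM adds one on top. The tiny config is only for demonstration.
from transformers import LlamaConfig, LlamaModel, LlamaForCausalLM

cfg = LlamaConfig(hidden_size=64, intermediate_size=128,
                  num_hidden_layers=1, num_attention_heads=4)
bare = LlamaModel(cfg)
full = LlamaForCausalLM(cfg)

print(hasattr(bare, "lm_head"))  # False -> accessing bare.lm_head raises AttributeError
print(hasattr(full, "lm_head"))  # True
```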

@cg123 (Collaborator) commented Jan 25, 2025

Thanks for the PR! I ended up implementing a more general solution that should also handle tied_names properly in #496 - hopefully this hits everything you need. If not let me know and I'll revisit this.

Thanks again!
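
For context, "tied" names refer to weights that share a single underlying tensor, most commonly the input embedding matrix and the output projection; that is why a checkpoint can store only model.embed_tokens.weight even though code also refers to lm_head.weight. A quick way to check whether a given model ties them (the model id below is used only as an example):

```python
# Quick check for tied embeddings (example model id; any causal LM works).
from transformers import AutoModelForCausalLM

m = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-0.5B")
emb = m.get_input_embeddings().weight
out = m.get_output_embeddings().weight
print(m.config.tie_word_embeddings)      # whether the config declares tying
print(emb.data_ptr() == out.data_ptr())  # True when both names share one tensor
```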

@cg123 closed this Jan 25, 2025
@jrruethe (Author) commented

Oh! Awesome, thank you so much @cg123!

One quick question: now that there is --save-module lm_head --save-module embed_tokens, do I need to re-extract, with this update, any loras that I extracted using the code from my PR? Am I missing anything? I have primarily been extracting loras as a way to "archive" finetunes and reduce disk space.

I appreciate the help!
