
Clarification on 3-Step Training Approach and Commands for Uni-MoE v2 #9

Open
Bhagyashreet20 opened this issue Jun 25, 2024 · 2 comments

Comments

@Bhagyashreet20

I like the innovative three-step approach to training MLLMs. It intrigued me, and I was going through the scripts trying to replicate the three-step training technique to train my own model. However, I have a few questions:

  1. Is it possible to replicate all three training steps with the scripts in the uni-moe-v2 folder?
  2. Could you share the command to train uni-moe-v2-speech? There are only inference and eval scripts.
  3. Regarding the three-step training approach and the released model checkpoints: Uni-MoE 8-expert base is the result of step 1, Uni_MoE 8-expert experts is the model after step 2, and Uni_MoE 8-expert finetune is the model after step 3. Is my understanding correct?
@expapa
Collaborator

expapa commented Jun 26, 2024

Thanks for your attention to and support of our model! Here are some replies; I hope they are helpful:

  1. Sorry, we are not releasing the training scripts for the first two stages, but these stages can be replicated by removing the MoE structure from the code.
  2. Sure, the training script will be uploaded soon; please check back for it.
  3. Actually, the projector weights and Q-Former weights are all updated during the first, second, and third stages. So Uni-MoE 8-expert base is the base model from which we train all our stages; Uni_MoE 8-expert experts is the stage-2 result, containing the MLPs from the stage-2 models; and Uni_MoE 8-expert finetune holds the LoRA weights plus the actual Q-Former and projector weights for the MoE model.
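To make the stage-3 checkpoint description above concrete: LoRA adapters store low-rank update matrices rather than full weights. Below is a minimal, hypothetical sketch (not the repository's actual loading code; the function name and shapes are made up for illustration) of the standard LoRA merge, where a finetuned weight is recovered as W' = W + (alpha / rank) * B @ A:

```python
import numpy as np

def merge_lora(base_weight, lora_A, lora_B, alpha, rank):
    """Merge a LoRA update into a base weight matrix.

    The finetuned weight is W' = W + (alpha / rank) * B @ A, where
    A (rank x in_features) and B (out_features x rank) are the
    low-rank adapter matrices stored in the finetune checkpoint.
    """
    return base_weight + (alpha / rank) * (lora_B @ lora_A)

# Toy example with made-up shapes; a real checkpoint contains
# one (A, B) pair per adapted layer.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))   # base weight, e.g. one attention projection
A = rng.standard_normal((2, 8))   # LoRA A, rank 2
B = rng.standard_normal((8, 2))   # LoRA B
W_merged = merge_lora(W, A, B, alpha=16, rank=2)
print(W_merged.shape)  # (8, 8) -- same shape as the base weight
```

Components like the Q-Former and projector, by contrast, are saved as full weights in the finetune checkpoint and simply replace the corresponding base-model tensors.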

@Bhagyashreet20
Author

Cool, thanks!
