
Moe merging failed #452

Open
PsoriasiIR opened this issue Nov 4, 2024 · 2 comments

@PsoriasiIR
PsoriasiIR commented Nov 4, 2024

I encountered an error while trying to merge two Qwen-based LoRA models using a mixture-of-experts (MoE) configuration with the Qwen architecture. I'm working with a phi2_moe2.yml configuration file, but the system throws an error about a missing field (merge_method).

Configuration and Setup

I am using the following configuration yml:

base_model: CMLM/ZhongJing-2-1_8b
gate_mode: hidden # one of "hidden", "cheap_embed", or "random"
#dtype: float16 # output dtype (float32, float16, or bfloat16)
experts:
  - source_model: CMLM/ZhongJing-2-1_8b
    positive_prompts: []
  - source_model: Qwen2.5-1.5B-Instruct
    positive_prompts: []

When I run this setup, I get the following error:

[2024-11-04 18:51:10] [ERROR] Invalid yaml 1 validation error for MergeConfiguration
merge_method
  Field required [type=missing, input_value={'base_model': 'CMLM/ZhongJing-2-1_8b', 'gate_mode': 'hidden', 'experts': [{'source_model': 'CMLM/ZhongJing-2-1_8b', 'positive_prompts': []}, {'source_model': 'Qwen2.5-1.5B-Instruct', 'positive_prompts': []}]}]

Attempted Solutions
I suspect adding merge_method might resolve the issue, but I'm not sure what options are available for this field. I would appreciate guidance on:

A complete YAML file for a Qwen MoE merge, including the appropriate merge_method
Documentation or examples: are there detailed examples or docs explaining each field in the MoE YAML configuration?
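
For reference, a minimal config that would satisfy this validator might look like the sketch below. It assumes mergekit's linear merge method and reuses the model names from this issue purely to illustrate the schema; as noted later in the thread, these two particular models differ in size, so this exact pair would not merge successfully:

```yml
# Sketch of a mergekit-yaml config: merge_method is the required field
# the validator is complaining about. "linear" is one of mergekit's
# built-in methods; the model names are taken from this issue and the
# two models shown are NOT actually size-compatible.
merge_method: linear
models:
  - model: CMLM/ZhongJing-2-1_8b
    parameters:
      weight: 0.5
  - model: Qwen2.5-1.5B-Instruct
    parameters:
      weight: 0.5
dtype: bfloat16
```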
Additional Context
First model: CMLM/ZhongJing-2-1_8b
Second model: Qwen2.5-1.5B-Instruct

Thank you for your assistance!

@cg123
Collaborator

cg123 commented Nov 4, 2024

It looks like you're using the mergekit-yaml command. For this type of config you want to use mergekit-moe.

In addition, this particular merge probably won't work - the two models you are looking at aren't the same size, so they will not be compatible.
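
One concrete way to check the size objection is to compare the shape-defining fields of each model's config.json. A sketch (the field names follow the Hugging Face transformers convention; the numeric values below are illustrative, not read from the actual repos):

```python
# Sketch: two models can only be combined into one MoE if their
# shape-defining config.json fields match. Field names follow the
# Hugging Face transformers convention; values are illustrative.
def shapes_compatible(cfg_a: dict, cfg_b: dict) -> bool:
    keys = ("hidden_size", "intermediate_size",
            "num_hidden_layers", "num_attention_heads")
    return all(cfg_a.get(k) == cfg_b.get(k) for k in keys)

# Illustrative shapes for the two model families in this issue.
zhongjing = {"hidden_size": 2048, "num_hidden_layers": 24,
             "intermediate_size": 5504, "num_attention_heads": 16}
qwen25 = {"hidden_size": 1536, "num_hidden_layers": 28,
          "intermediate_size": 8960, "num_attention_heads": 12}

print(shapes_compatible(zhongjing, qwen25))  # -> False
```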

@PsoriasiIR
Author

PsoriasiIR commented Nov 6, 2024

Thank you for your reply. Specifically, I have the following models:

Base Model: CMLM/ZhongJing-2-1_8b
Fine-Tuned Model: CMLM/ZhongJing-2-1_8b_finetuned based on Qwen-1.5-1.8B-Chat

Current Challenge: When attempting to merge these models using a YML configuration, I continue to encounter the error.

Could you provide an example of a correctly structured YML file for merging these models? Despite following available guidelines, attempts to merge via your space result in errors.

Attempted Configuration: Here's the YML configuration I used:

base_model: CMLM/ZhongJing-2-1_8b
gate_mode: hidden
dtype: bfloat16
experts:
  - source_model: CMLM/ZhongJing-2-1_8b
    positive_prompts:
      - "Human: 请从中医角度分析以下症状。\nAssistant: 好的,我会从中医理论出发,通过望闻问切的方法进行分析。"
      - "Human: 这些症状在中医理论中属于什么证型?\nAssistant: 让我根据中医辨证论治的原则来分析。"
      - "请解释一下中医的阴阳五行理论如何解释这个症状。"
      - "从中医角度来看,这些食材的性质和功效是什么?"
      - "这些中药的配伍原则是什么?"
    negative_prompts:
      - "What's the molecular mechanism of this drug?"
      - "Please explain the pathophysiology of this condition."
  - source_model: Qwen-1.5-1.8B-Chat
    positive_prompts:
      - "Based on modern medical research, what's the diagnosis?"
      - "What are the evidence-based treatment options for this condition?"
      - "Please explain the pathophysiological mechanism."
      - "What laboratory tests should be ordered?"
      - "According to clinical guidelines, what's the recommended treatment protocol?"
    negative_prompts:
      - "从阴阳五行的角度分析"
      - "请解释一下这个症状的中医证型"

Thank you very much for your time and assistance. I look forward to your guidance to resolve this merging issue effectively.
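
Stepping back, the original MergeConfiguration error arises when a mergekit-moe style config (with gate_mode and experts) is passed to mergekit-yaml, whose schema requires merge_method. A hypothetical pre-flight check along these lines can route a config to the right command (the function is an illustration, not part of mergekit):

```python
def pick_mergekit_command(cfg: dict) -> str:
    """Guess which mergekit CLI a parsed YAML config is meant for.

    Hypothetical helper: the key names match the configs in this
    thread, but this function is not part of mergekit itself.
    """
    if "experts" in cfg:       # MoE configs define experts/gate_mode
        return "mergekit-moe"
    if "merge_method" in cfg:  # regular merges must name a method
        return "mergekit-yaml"
    raise ValueError("config defines neither 'experts' nor 'merge_method'")

# The config from the original post parses to a dict like this:
moe_cfg = {"base_model": "CMLM/ZhongJing-2-1_8b",
           "gate_mode": "hidden",
           "experts": [{"source_model": "CMLM/ZhongJing-2-1_8b",
                        "positive_prompts": []}]}
print(pick_mergekit_command(moe_cfg))  # -> mergekit-moe
```

With a config like the one in this thread, the invocation would then be mergekit-moe path/to/config.yml ./output-dir rather than mergekit-yaml.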
