ValueError: operands could not be broadcast together with shapes (12582912,1) (3072,8192) #482

Open
bhuvneshsaini opened this issue Jan 6, 2025 · 1 comment


slices:
  - sources:
      - model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
        layer_range: [0, 28]
      - model: taareshg/Llama-3.2-3B-Instruct-En-Hi-merge-200k
        layer_range: [0, 28]
merge_method: slerp
base_model: unsloth/Llama-3.2-3B-Instruct-bnb-4bit
parameters:
  t:
    - filter: self_attn
      value: [0, 0.5, 0.3, 0.7, 1]
    - filter: mlp
      value: [1, 0.5, 0.7, 0.3, 0]
    - value: 0.5
dtype: bfloat16

error:
[2025-01-06 12:05:07] [INFO] Merge configuration saved in /tmp/tmp26k63_vv/merged/config.yaml
[2025-01-06 12:05:07] [INFO] Creating repo bhuvneshsaini/unsloth-merge-3.2-3B-Instruct-bnb-4bit
[2025-01-06 12:05:07] [INFO] Repo created: https://huggingface.co/bhuvneshsaini/unsloth-merge-3.2-3B-Instruct-bnb-4bit
[2025-01-06 12:05:07] [INFO] Running mergekit-yaml config.yaml merge --copy-tokenizer --cuda --low-cpu-memory --allow-crimes --lora-merge-cache /tmp/tmp26k63_vv/.lora_cache
[2025-01-06 12:05:18] [INFO] Warmup loader cache: 0%| | 0/2 [00:00<?, ?it/s]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 50%|█████ | 1/2 [00:07<00:07, 7.34s/it]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 100%|██████████| 2/2 [00:10<00:00, 4.62s/it]
[2025-01-06 12:05:20] [INFO] Warmup loader cache: 100%|██████████| 2/2 [00:10<00:00, 5.03s/it]
[2025-01-06 12:05:29] [INFO] Executing graph: 0%| | 0/1272 [00:00<?, ?it/s]
[2025-01-06 12:05:30] [INFO] Executing graph: 0%| | 5/1272 [00:07<30:20, 1.44s/it]
[2025-01-06 12:05:30] [INFO] Executing graph: 1%| | 14/1272 [00:07<11:05, 1.89it/s]
[2025-01-06 12:05:30] [INFO] Traceback (most recent call last):
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/bin/mergekit-yaml", line 8, in <module>
[2025-01-06 12:05:30] [INFO] sys.exit(main())
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
[2025-01-06 12:05:30] [INFO] return self.main(*args, **kwargs)
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/click/core.py", line 1078, in main
[2025-01-06 12:05:30] [INFO] rv = self.invoke(ctx)
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
[2025-01-06 12:05:30] [INFO] return ctx.invoke(self.callback, **ctx.params)
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/click/core.py", line 783, in invoke
[2025-01-06 12:05:30] [INFO] return __callback(*args, **kwargs)
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/options.py", line 82, in wrapper
[2025-01-06 12:05:30] [INFO] f(*args, **kwargs)
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/scripts/run_yaml.py", line 47, in main
[2025-01-06 12:05:30] [INFO] run_merge(
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/merge.py", line 96, in run_merge
[2025-01-06 12:05:30] [INFO] for _task, value in exec.run(quiet=options.quiet):
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/graph.py", line 197, in run
[2025-01-06 12:05:30] [INFO] res = task.execute(**arguments)
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/merge_methods/slerp.py", line 60, in execute
[2025-01-06 12:05:30] [INFO] slerp(
[2025-01-06 12:05:30] [INFO] File "/home/user/.pyenv/versions/3.10.15/lib/python3.10/site-packages/mergekit/merge_methods/slerp.py", line 137, in slerp
[2025-01-06 12:05:30] [INFO] dot = np.sum(v0 * v1)
[2025-01-06 12:05:30] [INFO] ValueError: operands could not be broadcast together with shapes (12582912,1) (3072,8192)
[2025-01-06 12:05:30] [ERROR] Command exited with code 1
[2025-01-06 12:05:30] [ERROR] Merge failed. Deleting repo as no model is uploaded.
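
For readers unfamiliar with the message: the failing line multiplies two arrays elementwise, which requires their shapes to be broadcast-compatible. The sketch below is a plain-Python restatement of the NumPy broadcasting rule, not code from mergekit; `broadcast_shape` is a hypothetical helper written only for illustration. It shows that the two shapes from the traceback clash on the leading axis.

```python
from itertools import zip_longest


def broadcast_shape(a, b):
    """Return the broadcast result shape, or raise ValueError if incompatible.

    Hypothetical helper that mirrors NumPy's rule: axes are compared from
    the trailing one backwards, and two sizes are compatible only if they
    are equal or one of them is 1.
    """
    result = []
    for x, y in zip_longest(reversed(a), reversed(b), fillvalue=1):
        if x != y and x != 1 and y != 1:
            raise ValueError(
                f"cannot broadcast {a} with {b}: axis sizes {x} and {y} differ"
            )
        # A size-1 axis stretches to match the other operand.
        result.append(y if x == 1 else x)
    return tuple(reversed(result))


# Shapes copied from the traceback above.
v0_shape = (12582912, 1)
v1_shape = (3072, 8192)

try:
    broadcast_shape(v0_shape, v1_shape)
    failed = False
except ValueError as exc:
    failed = True
    print(exc)
```

The trailing axis (1 against 8192) is fine because a size-1 axis can stretch, but the leading axis (12582912 against 3072) has no valid stretch, so the multiplication cannot proceed.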

ngxson commented Jan 8, 2025

You should try using meta-llama/Llama-3.2-3B-Instruct instead of unsloth/Llama-3.2-3B-Instruct-bnb-4bit
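
The thread does not spell out why the unquantized checkpoint should help. One plausible reading, offered here as an assumption rather than something confirmed above: the first shape in the error holds exactly half as many elements as the second, which is what one would expect if the bnb-4bit checkpoint stores each weight matrix as a single column of bytes with two 4-bit values packed per byte. A minimal sketch of that check follows; `looks_packed_4bit` is a hypothetical helper, not part of mergekit or bitsandbytes.

```python
from math import prod


def looks_packed_4bit(actual, expected):
    """Heuristic: is `actual` a column of bytes packing `expected` at 4 bits each?

    Assumes (not confirmed in the thread) that a packed tensor has shape
    (n / 2, 1), where n is the element count of the unpacked tensor.
    """
    return (
        len(actual) == 2
        and actual[1] == 1
        and actual[0] * 2 == prod(expected)
    )


# Shapes copied from the error message.
packed_guess = looks_packed_4bit((12582912, 1), (3072, 8192))
print(packed_guess)

# Two ordinary matrices of the same shape do not match the pattern.
same_shape = looks_packed_4bit((3072, 8192), (3072, 8192))
print(same_shape)
```

If that reading is right, the slerp step is being handed packed bytes for one model and a full matrix for the other, and a checkpoint with unquantized weights of matching shape would avoid the mismatch.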
