Bug: New DeepSeek-R1-Distill-Qwen models do not load #684

Open
Eudox67 opened this issue Jan 21, 2025 · 15 comments

Comments

@Eudox67

Eudox67 commented Jan 21, 2025

Contact Details

[email protected]

What happened?

When attempting to load a DeepSeek-R1-Distill-Qwen GGUF model, llamafile fails to load it -- at any size (1.5B, 7B, 14B, or 32B). This occurs using llamafiler, llamafile, or a .llamafile conversion, under both the traditional and --v2 modes.

Version

llamafile v0.9.0

What operating system are you seeing the problem on?

Linux

Relevant log output

llama_model_load: error loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'
llama_load_model_from_file: failed to load model
llama_init_from_gpt_params: error: failed to load model 'DeepSeek-R1-Distill-Qwen-14B-Q8_0.gguf'
{"function":"load_model","level":"ERR","line":463,"model":"DeepSeek-R1-Distill-Qwen-14B-Q8_0.gguf","msg":"unable to load model","tid":"12265824","timestamp":1737498365}

--v2
DeepSeek-R1-Distill-Qwen-14B-Q8_0.gguf: failed to load model
@quantumalchemy

Yep, not loading any DeepSeek distill -- vocabulary: unknown pre-tokenizer type: 'deepseek-r1

@sholtomaud

Serg Gini — Yesterday at 3:19 AM
Most of these models require explicit support, actually.
ggml-org/llama.cpp#11310
ggml-org/llama.cpp#11324

Most probably llamafile just needs an update to master from llama.cpp.
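
That matches the failure above: llama.cpp's vocabulary loader compares the GGUF metadata string tokenizer.ggml.pre against a hard-coded list of known pre-tokenizers and throws on anything it does not recognize, and the R1 distills ship the new value 'deepseek-r1-qwen'. A condensed sketch of the dispatch (simplified from llama.cpp's llm_load_vocab; exact names vary by version and this is not a verbatim copy):

// Condensed sketch of llama.cpp's pre-tokenizer dispatch (simplified).
// tokenizer_pre holds the GGUF metadata value "tokenizer.ggml.pre".
if (tokenizer_pre == "default") {
    vocab.type_pre = LLAMA_VOCAB_PRE_TYPE_DEFAULT;
} else if (tokenizer_pre == "qwen2") {
    vocab.type_pre = LLAMA_VOCAB_PRE_TYPE_QWEN2;
// ... more known strings ...
} else {
    // An unrecognized value is fatal -> "unknown pre-tokenizer type: '...'".
    throw std::runtime_error(format("unknown pre-tokenizer type: '%s'", tokenizer_pre.c_str()));
}

So a build whose vendored llama.cpp predates the new string cannot load these GGUFs, regardless of quantization or size.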

@BradHutchings

I was able to build my own GGUF and get llamafile to load it. It will be in my Brad's LLMs repo "soon". Watch for it in the models folder.

https://huggingface.co/bradhutchings/Brads-LLMs

@quantumalchemy

Yeah, thanks, but I want to use my own GGUF. I'll wait for the update; hopefully soon.
So how did you fix your LLM so it works?

@Eudox67
Author

Eudox67 commented Jan 24, 2025

Also:

Architecture    Model Size
openelm         1.08

I need 14B or 32B, and I want ROCm GPU offload optimization on a Linux server. I may have to get complicated and start using SGLang.

@evangineer

This has been fixed upstream:
ggml-org/llama.cpp@ec7f3ac
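
For reference, the change is small; a rough sketch of what the commit adds, based on how other pre-tokenizers are registered in llama.cpp (not a verbatim copy):

// 1. In llm_load_vocab(): recognize the new "tokenizer.ggml.pre" value.
} else if (tokenizer_pre == "deepseek-r1-qwen") {
    vocab.type_pre = LLAMA_VOCAB_PRE_TYPE_DEEPSEEK_R1_QWEN;
}

// 2. In the tokenizer: give the new pre-tokenizer its word-splitting rules.
case LLAMA_VOCAB_PRE_TYPE_DEEPSEEK_R1_QWEN:
    regex_exprs = { /* model-specific split regexes; see the commit */ };
    break;

Until llamafile pulls this in, its bundled loader keeps hitting the unknown-pre-tokenizer error.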

@Eudox67
Author

Eudox67 commented Jan 27, 2025

This has been fixed upstream: ggerganov/llama.cpp@ec7f3ac

Have you been able to compile it with llamafile?

@quantumalchemy
Copy link

Someone needs to fork this thing. I just read on Discord that the lead dev took a job at Google; of course, don't blame her.
Go get paid! You deserve it! Thanks for creating this thing, but it needs a fork to get updated; it's beyond my ken.

@Xydane
Contributor

Xydane commented Jan 29, 2025

I've fixed this in pull request #687

@cjpais
Collaborator

cjpais commented Jan 30, 2025

#687 just got merged! Please give it a try.

@xor2003

xor2003 commented Feb 6, 2025

Works with DeepSeek-R1-Distill-Qwen-1.5B-Q8_0.gguf for me.

@stonez56

stonez56 commented Feb 9, 2025

I don't know how to compile from source.
Is there a method, such as a JSON or YAML file, to allow adding custom model files in the future?

@DonaldCMLIN

Might I know whether there is any solution to fix llamafile.exe (v0.9) failing with:

loading model: error loading model vocabulary: unknown pre-tokenizer type: 'deepseek-r1-qwen'

Although many professionals have mentioned upgrading to a new version of llama.cpp,
it remains unclear how to integrate and build it into a new llamafile.

@cjpais
Collaborator

cjpais commented Feb 14, 2025

It's not in the main release yet; you will have to compile from source for now.

@DarkTyger

git clone https://github.com/Mozilla-Ocho/llamafile 
cd llamafile
make -j$(nproc)
make -j$(nproc) install PREFIX=$HOME/bin/llamafile

Error:

install: cannot stat 'o//stable-diffusion.cpp/main': No such file or directory
make: *** [Makefile:62: install] Error 1

Also:

git clone https://github.com/Mozilla-Ocho/llamafile 
cd llamafile
rm -rf llama.cpp
git clone https://github.com/ggerganov/llama.cpp
make -j$(nproc)

Error:

In file included from whisper.cpp/whisper.h:4:
whisper.cpp/ggml_extend.hpp:4:10: fatal error: 'llama.cpp/ggml-alloc.h' file not found
    4 | #include "llama.cpp/ggml-alloc.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
.cosmocc/4.0.2/bin/cosmoc++ -O2 -g -fexceptions -ffunction-sections -fdata-sections -mclang -DGGML_MULTIPLATFORM -frtti -std=gnu++23 -frtti -Wno-deprecated-declarations -iquote. -mcosmo -DGGML_MULTIPLATFORM -Wno-attributes -DLLAMAFILE_DEBUG  -Xx86_64-mtune=znver4 -c -o o//whisper.cpp/server.o whisper.cpp/server.cpp
In file included from whisper.cpp/mic2raw.cpp:19:
In file included from whisper.cpp/whisper.h:4:
whisper.cpp/ggml_extend.hpp:4:10: fatal error: 'llama.cpp/ggml-alloc.h' file not found
    4 | #include "llama.cpp/ggml-allollamafile/tokenize.cpp:27:10c.: h"fatal error: 'llama.cpp/llama.h' file not found

      |    27 | #include "         ^~~~~~~~~~~~~~~~~~~~~~~~l
lama.cpp/llama.h"
      |          ^~~~~~~~~~~~~~~~~~~
whisper.cpp/main.cpp:6:10: fatal error: 'llama.cpp/cores.h' file not found
    6 | #include "llama.cpp/cores.h"
      |          ^~~~~~~~~~~~~~~~~~~
In file included from whisper.cpp/mic2txt.cpp:19:
In file included from whisper.cpp/whisper.h:4:
whisper.cpp/ggml_extend.hpp:4:10: fatal error: 'llama.cpp/ggml-alloc.h' file not found
    4 | #include "llama.cpp/ggml-alloc.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make: *** [build/rules.mk:26: o//llamafile/tokenize.o] Error 1
make: *** Waiting for unfinished jobs....
1 error generated.
make: *** [build/rules.mk:26: o//whisper.cpp/mic2txt.o] Error 1
1 error generated.
make: *** [build/rules.mk:25: o//whisper.cpp/mic2raw.o] Error 1
1 error generated.
make: *** [build/rules.mk:25: o//whisper.cpp/grammar-parser.o] Error 1
1 error generated.
make: *** [build/rules.mk:25: o//whisper.cpp/main.o] Error 1
In file included from whisper.cpp/server.cpp:7:
In file included from whisper.cpp/whisper.h:4:
whisper.cpp/ggml_extend.hpp:4:10: fatal error: 'llama.cpp/ggml-alloc.h' file not found
    4 | #include "llama.cpp/ggml-alloc.h"
      |          ^~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.
make: *** [build/rules.mk:25: o//whisper.cpp/server.o] Error 1

The include files appear to have been restructured in upstream llama.cpp, which makes the build fail. I didn't see an easy way to update the include path, so I revised the includes manually:

includes.patch.txt

With these changes in place, the build still fails.
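
For illustration, the kind of rewrite involved (a hypothetical before/after; the actual changes are in includes.patch.txt above). llamafile's vendored tree includes llama.cpp headers by repo-relative path, and upstream has since moved its headers, so each include needs retargeting:

// Old include style in llamafile's vendored tree:
#include "llama.cpp/ggml-alloc.h"

// With current upstream llama.cpp dropped in, the header lives elsewhere
// (e.g. under ggml/include/), so the include -- plus matching -I flags --
// must be retargeted, roughly:
#include "ggml-alloc.h"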
