When a HuggingFace transformers model has tokenizer_config.json and tokenizer.json, how do I configure the config.cfg file? #8907
-
How to reproduce the behaviour

I use the model https://huggingface.co/hfl/chinese-roberta-wwm-ext/tree/main, with a training config like below:

```
[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v1"
name = "./chinese-roberta-wwm-ext"

[components.transformer.model.tokenizer_config]
use_fast = true
```

But the model directory contains tokenizer_config.json and tokenizer.json, and these files seem not to be used by spaCy.
Replies: 2 comments 4 replies
-
When you load a Transformer/HuggingFace model with spaCy, it uses the HuggingFace code, so even if the config file doesn't mention it, the model can do other things, which might include loading JSON config files. I'm a little unclear about what you're trying to do. Are you trying to customize the JSON config files, or does loading that model not work, or...? If you have tried something and it didn't work, it would be helpful to see any error messages you've gotten.
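For reference, everything under `tokenizer_config` in the spaCy config is forwarded as keyword arguments to HuggingFace's `AutoTokenizer.from_pretrained`, which itself reads tokenizer.json and tokenizer_config.json from the model directory; so those files are consumed by the HuggingFace side rather than by spaCy directly. A minimal sketch (the comments describe the forwarding behaviour, not extra settings you need):

```
[components.transformer.model]
@architectures = "spacy-transformers.TransformerModel.v1"
name = "./chinese-roberta-wwm-ext"

[components.transformer.model.tokenizer_config]
# Each entry here becomes a keyword argument to
# AutoTokenizer.from_pretrained(name, **tokenizer_config).
# The tokenizer.json / tokenizer_config.json files in the model
# directory are then read by the HuggingFace tokenizer itself.
use_fast = true
```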
-
I'm facing the same issue.