[finetuner-workflow]: issues encountered at download model step #353

Open
dmarx opened this issue Feb 26, 2024 · 0 comments
dmarx commented Feb 26, 2024

1. Given `tensorizer_uri` pointing to the tensorizer S3 bucket and `model` provided as a Hugging Face identifier, the pipeline incorrectly attempts to use the `model` argument as a path on the PVC (a sketch of the expected resolution logic follows the logs below).
»  k logs llm-finetune-template-fph9l-model-tokenizer-1285859084 -c main
2024/02/26 16:43:36 Tokenizer definition: /finetune-data/models/mistralai/Mistral-7B-v0.1
2024/02/26 16:43:36 Tokenizer input source: /finetune-data/sep23_txt
2024/02/26 16:43:36 Tokenizer output: /finetune-data/sep23_txt-mistralai_Mistral_7B_v0_1-2048-b-1-7c82f97.tokens
2024/02/26 16:43:36 Tokenizer reordering method: 
2024/02/26 16:43:36 Sampling amount (in % tokens kept): 100%
2024/02/26 16:43:36 Resolving /finetune-data/models/mistralai/Mistral-7B-v0.1/config.json... 
2024/02/26 16:43:37 /finetune-data/models/mistralai/Mistral-7B-v0.1/config.json not found, required!
2024/02/26 16:43:37 cannot retrieve required `/finetune-data/models/mistralai/Mistral-7B-v0.1 from config.json`: HTTP status code 404
time="2024-02-26T16:43:37.956Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 1
2. When `tensorizer_uri` is not provided, the model appears to download from Hugging Face as expected, but the downloader then throws an error that appears to stem from an attempted module-name conversion (a more robust shard-count derivation is sketched after the logs below).
»  k logs llm-finetune-template-4rxg9-model-downloader-781196140 -c main
2024/02/26 17:22:59 Resolving mistralai/Mistral-7B-v0.1/pytorch_model.bin.index.json... 
2024/02/26 17:22:59 Downloaded mistralai/Mistral-7B-v0.1/pytorch_model.bin.index.json... 24 kB completed.
2024/02/26 17:22:59 Resolving mistralai/Mistral-7B-v0.1/tokenizer.json... 
2024/02/26 17:23:00 Downloaded mistralai/Mistral-7B-v0.1/tokenizer.json... 1.8 MB completed.
2024/02/26 17:23:00 Resolving mistralai/Mistral-7B-v0.1/tokenizer_config.json... 
2024/02/26 17:23:00 Downloaded mistralai/Mistral-7B-v0.1/tokenizer_config.json... 967 B completed.
2024/02/26 17:23:00 Resolving mistralai/Mistral-7B-v0.1/config.json... 
2024/02/26 17:23:00 Downloaded mistralai/Mistral-7B-v0.1/config.json... 571 B completed.
2024/02/26 17:23:00 Resolving mistralai/Mistral-7B-v0.1/vocab.json... 
2024/02/26 17:23:00 Resolving mistralai/Mistral-7B-v0.1/merges.txt... 
2024/02/26 17:23:00 Resolving mistralai/Mistral-7B-v0.1/special_tokens_map.json... 
2024/02/26 17:23:00 Downloaded mistralai/Mistral-7B-v0.1/special_tokens_map.json... 72 B completed.
2024/02/26 17:23:00 Resolving mistralai/Mistral-7B-v0.1/wordtokens.json... 
2024/02/26 17:23:01 Vocab written to vocab.json from tokenizer.json
2024/02/26 17:23:01 Merges written to merges.txt from tokenizer.json
2024/02/26 17:23:01 Pytorch Model File exists: false
2024/02/26 17:23:01 Shard config exists: true
2024/02/26 17:23:01 Could not find number of shards from config: could not convert weight_map to embed_out
2024/02/26 17:23:01 Error downloading model resources: could not find number of shards from config
Error: Could not convert weight_map to embed_out
time="2024-02-26T17:23:01.853Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 1
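
Regarding the first failure: the log shows the tokenizer step resolving `config.json` under `/finetune-data/models/<model>` even though the model was given as a Hugging Face repo id and the weights live in the tensorizer S3 bucket. A minimal sketch of the resolution behavior that would avoid this, assuming the PVC layout shown in the logs (the helper name and structure are hypothetical, not the workflow's actual code):

```python
# Minimal sketch (not the actual workflow code) of resolving the `model`
# argument. PVC_MODEL_ROOT mirrors the /finetune-data/models path in the logs.
import os

PVC_MODEL_ROOT = "/finetune-data/models"

def resolve_model_source(model: str) -> str:
    """Return where config.json / tokenizer files should be read from."""
    pvc_path = os.path.join(PVC_MODEL_ROOT, model)
    if os.path.isfile(os.path.join(pvc_path, "config.json")):
        # Model already materialized on the PVC: use the local copy.
        return pvc_path
    # Otherwise treat `model` as a Hugging Face repo id instead of
    # unconditionally joining it onto the PVC path.
    return model
```

With `model=mistralai/Mistral-7B-v0.1` absent from the PVC, the current behavior instead resolves `/finetune-data/models/mistralai/Mistral-7B-v0.1/config.json` and fails with the 404 shown in the first log.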
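
Regarding the second failure: the "Could not convert weight_map to embed_out" message suggests the shard count is derived by looking up a GPT-NeoX-style parameter name (`embed_out`) in the index, which Mistral's checkpoint does not contain (its output head is `lm_head.weight`). An architecture-agnostic way to derive the count from the standard `pytorch_model.bin.index.json`, offered as an illustrative sketch rather than the downloader's actual code:

```python
# Illustrative sketch: derive the shard count from the Hugging Face index file
# by counting distinct shard filenames in weight_map, instead of keying on a
# specific parameter name such as `embed_out`.
import json

def shard_count_from_index(index_path: str) -> int:
    with open(index_path) as f:
        index = json.load(f)
    # weight_map maps each parameter name to the shard file containing it,
    # e.g. "model.embed_tokens.weight" -> "pytorch_model-00001-of-00002.bin".
    return len(set(index["weight_map"].values()))
```

This sidesteps architecture-specific layer names entirely; the count is simply the number of distinct `pytorch_model-*-of-*.bin` files referenced by the index.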
harubaru self-assigned this Mar 5, 2024