I have a question I would like to share with the authors. I would be very grateful if you could reply.
As far as I understand, your work follows InstructBLIP. However, in the original InstructBLIP paper, the LLM weights they used were vicuna_v1.1, not the v0.1 used here. Why did you choose different LLM weights?
In fact, I tried training with vicuna_v1.1, but I ran into the error in the title, "ValueError: Attempting to unscale FP16 gradients". By tracing the error, I found that the main problem may be caused by the following code in BLIVA/bliva/models/blip2_vicuna_instruct.py:
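For context, a common cause of this exact error in PyTorch AMP training is that trainable parameters end up in fp16 (e.g. when the whole checkpoint is loaded in half precision), since `GradScaler.unscale_` refuses fp16 gradients. A minimal sketch of the usual workaround, assuming the trainable modules (for instance the projection layers, while the LLM stays frozen in fp16) can be cast back to fp32 — this is an illustration of the general AMP pattern, not the actual BLIVA code:

```python
import torch

# Hypothetical stand-in for a trainable module whose weights were
# loaded in fp16 along with the rest of the checkpoint.
proj = torch.nn.Linear(4, 4).half()

# Workaround: cast trainable parameters back to fp32 so that
# GradScaler.unscale_ sees fp32 gradients instead of raising
# "ValueError: Attempting to unscale FP16 gradients".
for p in proj.parameters():
    if p.requires_grad:
        p.data = p.data.float()

loss = proj(torch.randn(2, 4)).sum()
loss.backward()
# Gradients are now fp32, which AMP's unscale_ accepts.
```

Whether this matches the specific code path in blip2_vicuna_instruct.py is something the authors would need to confirm.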
Thank you for your interest in our work. Could you please also try training with v0.1 under the same settings to verify that this is the problem? v1.1 and v0.1 differ only in tokenization and the separator.
Thank you for your reply. Does the difference between v0.1 and v1.1 really exist only in tokenization and the separator? I tried generating with v0.1 and v1.1 respectively and got completely different results. Using a mismatched Vicuna version for generation produces confusing outputs.
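This is consistent with the two versions expecting different conversation templates. A rough sketch of the mismatch, based on the FastChat-style templates usually associated with these versions (the role names and separators here are assumptions about those templates, not taken from the BLIVA code):

```python
def vicuna_v0_prompt(user_msg: str) -> str:
    # v0-style template: "###"-separated Human/Assistant turns.
    return f"### Human: {user_msg}\n### Assistant:"

def vicuna_v11_prompt(user_msg: str) -> str:
    # v1.1-style template: USER/ASSISTANT roles, turns ended by </s>.
    return f"USER: {user_msg} ASSISTANT:"
```

A model fine-tuned on one template sees out-of-distribution role tags and separators when prompted with the other, which would explain the confusing generations even if the base weights are otherwise similar.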
Did you encounter similar problems, and is that why you replaced v1.1 with v0.1?