I have a question I would like to share with the authors. I would be very grateful if you could reply.
As far as I understand, your work follows InstructBLIP. However, in the original InstructBLIP paper, the LLM weights they used were vicuna_v1.1, not the v0.1 used here. Why did you choose different LLM weights?
In fact, I tried training with vicuna_v1.1, but I ran into the error in the title, "ValueError: Attempting to unscale FP16 gradients". By tracing the error, I found that the main problem may be caused by the following code in BLIVA/bliva/models/blip2_vicuna_instruct.py:
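For context, a common cause of this exact error in PyTorch AMP training is that trainable parameters end up in fp16 (e.g. when the whole checkpoint is loaded in half precision), since `GradScaler.unscale_` refuses fp16 gradients. A minimal sketch of the usual workaround, assuming the trainable modules (for instance the projection layers, while the LLM stays frozen in fp16) can be cast back to fp32 — this is an illustration of the general AMP pattern, not the actual BLIVA code:

```python
import torch

# Hypothetical stand-in for a trainable module whose weights were
# loaded in fp16 along with the rest of the checkpoint.
proj = torch.nn.Linear(4, 4).half()

# Workaround: cast trainable parameters back to fp32 so that
# GradScaler.unscale_ sees fp32 gradients instead of raising
# "ValueError: Attempting to unscale FP16 gradients".
for p in proj.parameters():
    if p.requires_grad:
        p.data = p.data.float()

loss = proj(torch.randn(2, 4)).sum()
loss.backward()
# Gradients are now fp32, which AMP's unscale_ accepts.
```

Whether this matches the specific code path in blip2_vicuna_instruct.py is something the authors would need to confirm.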
Thank you for your interest in our work. Could you please also try training with v0.1 under the same settings to verify that this is the problem? v1.1 and v0.1 differ only in tokenization and the separator.
Thank you for your reply. Does the difference between v0.1 and v1.1 really exist only in tokenization and the separator? I tried generating with v0.1 and v1.1 respectively and got completely different results. Using a mismatched Vicuna version for generation produces confusing outputs.
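This is consistent with the two versions expecting different conversation templates. A rough sketch of the mismatch, based on the FastChat-style templates usually associated with these versions (the role names and separators here are assumptions about those templates, not taken from the BLIVA code):

```python
def vicuna_v0_prompt(user_msg: str) -> str:
    # v0-style template: "###"-separated Human/Assistant turns.
    return f"### Human: {user_msg}\n### Assistant:"

def vicuna_v11_prompt(user_msg: str) -> str:
    # v1.1-style template: USER/ASSISTANT roles, turns ended by </s>.
    return f"USER: {user_msg} ASSISTANT:"
```

A model fine-tuned on one template sees out-of-distribution role tags and separators when prompted with the other, which would explain the confusing generations even if the base weights are otherwise similar.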
Did you encounter similar problems, and is that why you replaced v1.1 with v0.1?