Questions about projection layer #74

Wgkai · 2024-07-25T11:31:01Z

https://github.com/rese1f/MovieChat/blob/a2d0da02f6cf16383ae1f9891ecf62e6f8f798a5/MovieChat/models/moviechat.py#L156C9-L176C1

        logging.info('Loading LLAMA proj')
        self.llama_proj = nn.Linear(
            self.Qformer.config.hidden_size, self.llama_model.config.hidden_size
        )
        if llama_proj_model:
            print("load llama proj weight: {}".format(llama_proj_model))
            llama_proj_weight = torch.load(llama_proj_model, map_location="cpu")
            msg = model.load_state_dict(llama_proj_weight['model'], strict=False)


        if frozen_llama_proj:
            #  todo frozen  llama_proj
            for name, param in self.llama_proj.named_parameters():
                param.requires_grad = False
            logging.info('LLAMA proj is frozen')
        else:
            for name, param in self.llama_proj.named_parameters():
                param.requires_grad = True
            logging.info('LLAMA proj is not frozen')


        logging.info('Loading llama_proj Done')

I notice that although I specified llama_proj_model in the config file, the llama_proj_model argument passed in is still ' '. Does that mean self.llama_proj is randomly initialized and the weights are never loaded?


Espere-1119-Song · 2024-07-25T13:16:32Z

Sorry for the confusing code. In fact, the weights of the projection layer are loaded after the model is initialized.
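As a rough illustration of why a randomly initialized layer is harmless when a checkpoint is loaded afterwards, here is a minimal sketch. `TinyModel`, its dimensions, and the in-memory `checkpoint` dict are hypothetical stand-ins, not the MovieChat classes or files; only the `nn.Linear` layer name and the `'model'` key mirror the snippet quoted above.

```python
import torch
import torch.nn as nn


class TinyModel(nn.Module):
    """Hypothetical stand-in for the full model; only the projection layer matters here."""

    def __init__(self, qformer_dim=8, llama_dim=16):
        super().__init__()
        # Randomly initialized at construction time, as in the quoted snippet.
        self.llama_proj = nn.Linear(qformer_dim, llama_dim)


# Pretend this dict is a fine-tuned checkpoint stored under the 'model' key.
trained = TinyModel()
checkpoint = {"model": {k: v.clone() for k, v in trained.state_dict().items()}}

# A freshly built model starts with its own random projection weights ...
model = TinyModel()
before = model.llama_proj.weight.detach().clone()

# ... and loading the checkpoint afterwards overwrites them.
msg = model.load_state_dict(checkpoint["model"], strict=False)

weights_replaced = torch.equal(
    model.llama_proj.weight, checkpoint["model"]["llama_proj.weight"]
)
weights_changed = not torch.equal(model.llama_proj.weight, before)
print(msg.missing_keys, weights_replaced, weights_changed)
```

With `strict=False`, `msg.missing_keys` lists any parameters the checkpoint did not provide, so it is a quick way to confirm that the projection layer was actually covered by the load.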

xuzq23 · 2024-08-29T07:34:11Z

I ran into similar confusion about the model path settings in eval_configs/MovieChat.yaml:

  llama_proj_model: 'ckpt/minigpt4/pretrained_minigpt4.pth'
  ckpt: "ckpt/finetune-vicuna7b-v2.pth"

According to your explanation above, do llama_proj_model and ckpt (line 690) have the same effect?

Espere-1119-Song · 2024-08-29T07:37:35Z

Yes, they have the same effect. llama_proj_model does not seem to be used in the code, and we will revise it.
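One way to check this for a given checkpoint is to list the parameter names that belong to the projection layer. The helper below is a hypothetical sketch, not part of the repository, and it assumes the checkpoint stores its state dict under the `'model'` key with names prefixed by `llama_proj.`, as the quoted loading code suggests. The dictionary used here is a made-up stand-in for a loaded file.

```python
def keys_with_prefix(state_dict, prefix="llama_proj."):
    """Return the sorted parameter names in state_dict that start with prefix."""
    return sorted(k for k in state_dict if k.startswith(prefix))


# Made-up stand-in for torch.load(path, map_location="cpu")["model"];
# the tensor values are irrelevant for this check, so None is used.
fake_state = {
    "llama_proj.weight": None,
    "llama_proj.bias": None,
    "other_layer.weight": None,
}

proj_keys = keys_with_prefix(fake_state)
print(proj_keys)
```

If the list is empty for the file given as `ckpt`, the projection layer would stay at its random initialization, so the check is worth running once on the real checkpoint.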

xuzq23 · 2024-08-29T08:38:58Z

> Yes, they have the same effect. llama_proj_model does not seem to be used in the code, and we will revise it.

Thank you sincerely.

Wgkai changed the title ~~Question about llama~~ Question about llama_projection layer Jul 25, 2024

Wgkai changed the title ~~Question about llama_projection layer~~ Question about projection layer Jul 25, 2024

Wgkai changed the title ~~Question about projection layer~~ Questions about projection layer Jul 25, 2024
