[P1] Question regarding training flag. #139

Open
m-dev12 opened this issue Oct 20, 2024 · 4 comments
@m-dev12

m-dev12 commented Oct 20, 2024

No description provided.

@m-dev12
Author

m-dev12 commented Oct 20, 2024

Hi @frankaging ,

If I print self.training in the forward pass of LoreftIntervention, it is False after the first epoch.
Is this behavior expected? How else can I track whether the model is in train or eval mode?
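One way to track the mode without editing the intervention's forward function is a forward pre-hook that records `module.training` on every call. The sketch below is minimal and uses a hypothetical `ToyIntervention` as a stand-in for LoreftIntervention; only the PyTorch calls (`register_forward_pre_hook`, `train()`, `eval()`) are real API.

```python
import torch
from torch import nn


class ToyIntervention(nn.Module):
    """Hypothetical stand-in for an intervention module (not the pyreft class)."""

    def __init__(self, dim=4):
        super().__init__()
        self.proj = nn.Linear(dim, dim)
        self.dropout = nn.Dropout(0.1)

    def forward(self, x):
        return x + self.dropout(self.proj(x))


def make_mode_logger(records):
    # Forward pre-hooks receive (module, positional_args) before forward runs.
    def hook(module, args):
        records.append(module.training)
    return hook


intervention = ToyIntervention()
records = []
handle = intervention.register_forward_pre_hook(make_mode_logger(records))

x = torch.randn(2, 4)
intervention.train()
intervention(x)          # logged as True
intervention.eval()
intervention(x)          # logged as False
intervention(x)          # still False: nothing switched the mode back
intervention.train()
intervention(x)          # logged as True again
handle.remove()          # detach the hook once the check is done

print(records)  # [True, False, False, True]
```

Attaching the same hook to each intervention before calling the trainer would show exactly which forward passes run with the flag off.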

I was trying to figure out why this may be the case:
My hypothesis is that evaluate in class ReftTrainerForSequenceClassification(ReftTrainer) runs after each epoch.

image

But does the training step in the HF Trainer not turn training mode back on in the same way?
https://github.com/huggingface/transformers/blob/174890280b340b89c5bfa092f6b4fb0e2dc2d7fc/src/transformers/trainer.py#L3311

image

The LoReFT scripts also had this line https://github.com/stanfordnlp/pyreft/blob/bc8a49c6e5307e7d67c910292d4035a1384c1790/examples/loreft/original_code/task_steer.py#L261C5-L261C29 before training.

Could you help me understand whether this is expected or a bug? How else can I track whether the model is in train or eval mode?

Thanks!
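One general PyTorch mechanism that would produce this symptom: `train()` and `eval()` only recurse into registered submodules. A module kept in a plain Python dict is skipped, so if an evaluation loop switches it to eval explicitly, a later `model.train()` does not switch it back. Whether pyreft stores its interventions this way is an assumption here and is not confirmed in this thread. `Wrapper` and its attribute names are hypothetical.

```python
import torch
from torch import nn


class Wrapper(nn.Module):
    """Hypothetical wrapper holding modules in three different ways."""

    def __init__(self):
        super().__init__()
        self.backbone = nn.Linear(4, 4)                            # registered child
        self.registered = nn.ModuleDict({"iv": nn.Linear(4, 4)})   # registered children
        self.hidden = {"iv": nn.Linear(4, 4)}                      # plain dict: not registered


model = Wrapper()

# Suppose an evaluation loop switches the dict-held modules to eval explicitly.
for module in model.hidden.values():
    module.eval()
model.eval()

# The next training step only calls train() on the wrapper.
model.train()

flags = {
    "backbone": model.backbone.training,
    "registered": model.registered["iv"].training,
    "hidden": model.hidden["iv"].training,
}
print(flags)  # {'backbone': True, 'registered': True, 'hidden': False}

# Possible workaround under this assumption: switch the dict-held modules explicitly.
for module in model.hidden.values():
    module.train()
```

If the flag stays False for this reason, the fix is either to register the modules (for example with `nn.ModuleDict`) or to set their mode explicitly before each training phase.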

@m-dev12
Author

m-dev12 commented Oct 20, 2024

Also, if this is an issue, it has implications for whether the interventions are still being trained in subsequent epochs, or whether only the classifier head is being trained.

@frankaging
Collaborator

@m-dev12 Hey, thanks for the input - it might take me a while to get back to this with actual testing since I am busy with other stuff right now. But here are a couple of pointers for you to test out:

  • Set allow_cls_grad to False and see if you can still train. I think even with a random head, it will get pretty decent accuracy. That would also mean the interventions are receiving gradients and the optimizer is updating their weights.
  • Print out the weights at each step?
  • For all of the other experiments, there is no head and the interventions are trainable, so it probably does not mean we are only training the head here.
  • This issue could also arise from transformers versioning.
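The first two pointers can be combined into one small check: snapshot the intervention weights, take one optimizer step, and compare. The sketch below uses hypothetical stand-ins (`intervention`, `head`) rather than pyreft objects, and freezing the head with `requires_grad_(False)` is only an analogy for turning off allow_cls_grad. It also shows that the training flag does not gate gradients: it changes the behavior of layers such as dropout and batch norm, not whether backpropagation reaches the parameters.

```python
import torch
from torch import nn

torch.manual_seed(0)

intervention = nn.Linear(4, 4)   # hypothetical stand-in for a trainable intervention
head = nn.Linear(4, 2)           # hypothetical stand-in for the classifier head
for p in head.parameters():
    p.requires_grad_(False)      # frozen head, analogous to disabling head gradients

intervention.eval()              # training flag is False on purpose
optimizer = torch.optim.SGD(intervention.parameters(), lr=0.1)

# Snapshot every parameter before the step.
before_iv = {n: p.detach().clone() for n, p in intervention.named_parameters()}
before_head = {n: p.detach().clone() for n, p in head.named_parameters()}

x = torch.randn(8, 4)
y = torch.randint(0, 2, (8,))
loss = nn.functional.cross_entropy(head(intervention(x)), y)

optimizer.zero_grad()
loss.backward()
optimizer.step()

# True means the parameter moved during this step.
iv_changed = {
    n: not torch.equal(before_iv[n], p.detach())
    for n, p in intervention.named_parameters()
}
head_changed = {
    n: not torch.equal(before_head[n], p.detach())
    for n, p in head.named_parameters()
}
print("intervention changed:", iv_changed)
print("head changed:", head_changed)
```

Running the same comparison on the real intervention parameters after each epoch would show directly whether they are still being updated while the flag reads False.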

Keep us posted of your findings!

@frankaging frankaging changed the title Question regarding training flag. [P1] Question regarding training flag. Oct 21, 2024
@frankaging frankaging self-assigned this Oct 21, 2024
@frankaging frankaging added the question Further information is requested label Oct 21, 2024
@m-dev12
Author

m-dev12 commented Oct 21, 2024

Sure @frankaging, thanks for the input! Let me try out a few things next week and I'll let you know. And yes, the version might also be an issue, since I am on the latest transformers version.
