Question about the update method of ema_model #16
Comments
Good questions.
Great, thank you!
Thanks for your explanation. I noticed that you copy the BN parameters from self.tmp_model instead of self.model. Could you please explain why this is?
@xmengli999 He only wants to copy the batch norm statistics (running mean and std, which are crucial for inference) from self.model (the current model) to ema_model at the end of every epoch. This is because batch normalization buffers such as running_mean and running_var are not included in model.parameters(); however, they do appear in model.state_dict(). I think the BN statistics of ema_model should also be exponentially moving averaged, since as written the BN statistics of tmp_model are simply those of ema_model.
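To illustrate the distinction above, here is a minimal sketch without PyTorch (the TinyModel class and its attribute names are hypothetical, invented purely for illustration): learnable weights show up in parameters(), but BN running statistics are buffers that only appear in state_dict(), so an EMA over parameters() alone would never update them.

```python
# Hypothetical minimal illustration (no PyTorch): why BN running stats must be
# copied via state_dict() rather than parameters().
class TinyModel:
    def __init__(self):
        self.weight = 1.0        # learnable parameter, updated by the optimizer
        self.running_mean = 0.0  # BN buffer: updated by forward passes, not the optimizer

    def parameters(self):
        return {"weight": self.weight}  # buffers are absent here

    def state_dict(self):
        return {"weight": self.weight, "running_mean": self.running_mean}

model, ema_model = TinyModel(), TinyModel()
model.running_mean = 0.5  # pretend training forward passes updated the BN statistic

# An EMA loop over parameters() alone would never touch running_mean;
# copying it out of state_dict keeps the EMA model usable at eval time.
ema_model.running_mean = model.state_dict()["running_mean"]
```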
@YU1ut thanks for the implementation! A few more questions on the WeightEMA class:
What does it do?
Thanks!
Hello,
I am Sanyou. Thanks for your contribution, and sorry for disturbing you again. I have two questions about the class WeightEMA(object):
Firstly, what is param.data.mul_(1 - self.wd) for?
Secondly, I am curious why you don't use the ema_model update method from Mean Teacher. (I tried it, but the result failed completely.) Additionally, if I drop the mix-up trick in your code framework (which I think should then be equivalent to Mean Teacher), I can't reach the Mean Teacher results: with 1000 labels I get 77% vs. Mean Teacher's 79%, and with 4000 labels 87% vs. 88%.
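For readers following the two questions above, here is a hedged sketch of what is being discussed, written on plain Python floats rather than tensors (the function name ema_step and the constants are illustrative, not taken from the repo): the teacher tracks an exponential moving average of the student, and param.data.mul_(1 - self.wd) amounts to an extra decoupled weight decay applied to the student weights inside the same step.

```python
# Sketch of an EMA update with decoupled weight decay, on plain floats.
# Names and constants are illustrative assumptions, not the repo's actual code.
def ema_step(params, ema_params, alpha=0.999, wd=0.0004):
    """One update step:
    1) teacher <- alpha * teacher + (1 - alpha) * student   (Mean-Teacher-style EMA)
    2) student <- student * (1 - wd)                        (decoupled weight decay,
       the role played by param.data.mul_(1 - self.wd))"""
    for i in range(len(params)):
        ema_params[i] = alpha * ema_params[i] + (1.0 - alpha) * params[i]
        params[i] *= (1.0 - wd)  # shrink student weights toward zero
    return params, ema_params

# Example with easy-to-check numbers:
params, ema_params = ema_step([1.0], [0.0], alpha=0.9, wd=0.01)
# teacher: 0.9 * 0.0 + 0.1 * 1.0 = 0.1 ; student: 1.0 * (1 - 0.01) = 0.99
```

Without step 2 this reduces to the plain Mean Teacher EMA update, so the weight decay term is the main structural difference being asked about.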
@YU1ut