Thank you for your great work on this repo and the paper!
I noticed that the ResNet-50 result with non-conventional usage has the best performance. I would like to know how to implement this 'non-conventional usage'.
Does it mean 'discarding the down-sampling operation between stage3 and stage4' in Section 3.1 of the paper?
Thanks a lot.
Thanks for your interest in our paper.
To answer your question: it means 'discarding the down-sampling operation between stage3 and stage4' in Section 3.1 of the paper.
So, you are right :)
Thank you for your response!
So, 'discarding the down-sampling operation between stage3 and stage4' means changing the stride of the last ResNet stage (layer4) from 2 to 1, i.e. last stride = 1?
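If so, I assume it corresponds to something like the following (a minimal sketch based on torchvision's `resnet50`; this is my own guess at the implementation, not code taken from your repo):

```python
import torchvision

# Load an ImageNet-pretrained ResNet-50 from torchvision.
backbone = torchvision.models.resnet50(pretrained=True)

# "Discard the down-sampling between stage3 and stage4":
# set the stride of the first bottleneck in layer4 (and its shortcut conv) to 1,
# so the final feature map keeps a 14x14 spatial size for 224x224 inputs.
backbone.layer4[0].conv2.stride = (1, 1)
backbone.layer4[0].downsample[0].stride = (1, 1)
```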
Recently I reimplemented your work in PyTorch, but when evaluating on CUB-200-2011 I only get 62% Recall@1. I believe I am missing some important details.
So, a few questions:
1. Batch sampling: do you shuffle all samples and take 128 samples per batch, or do you use P-K sampling (P classes, K samples per class)? (See the sampler sketch right after these questions.)
2. I find that removing the L2 norm and FC after the GD (as mentioned in another issue) gives higher performance, though it still does not reach your reported results on CUB-200. Do you know the reason for this?
3. Could you share some training tricks?
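For question 1, this is roughly the P-K sampler I am using right now (a sketch of my own reimplementation only; `labels` is assumed to be the list of class labels of my CUB-200 training set, and all names are mine):

```python
import random
from collections import defaultdict

from torch.utils.data import Sampler


class PKSampler(Sampler):
    """Yield batches of P classes x K samples per class (P*K samples per batch)."""

    def __init__(self, labels, p=32, k=4):
        self.p, self.k = p, k
        # Group dataset indices by class label.
        self.index_by_class = defaultdict(list)
        for idx, label in enumerate(labels):
            self.index_by_class[label].append(idx)
        self.classes = list(self.index_by_class.keys())
        self.num_batches = len(labels) // (p * k)

    def __iter__(self):
        for _ in range(self.num_batches):
            batch = []
            # Sample P classes, then K images per class
            # (with replacement if a class has fewer than K images).
            for cls in random.sample(self.classes, self.p):
                pool = self.index_by_class[cls]
                if len(pool) < self.k:
                    batch.extend(random.choices(pool, k=self.k))
                else:
                    batch.extend(random.sample(pool, self.k))
            yield batch

    def __len__(self):
        return self.num_batches
```

I pass this as `batch_sampler` to the `DataLoader`, so with P=32 and K=4 each batch has 128 images. Is this the sampling scheme you used, or did you simply shuffle the whole training set?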
Looking forward to your guidance.
Thank you so much!