Refactor MAE to use timm ViT #1364
Comments
Hi, I am working on implementing this and should be able to submit a pull request soon.
Hi @radiradev! This issue seems to have gotten lost. In the meantime, the MAE examples have already been updated to use the timm ViT. See the docs and the benchmark. Is there anything in particular that you found is still missing? Otherwise I would close this issue. One thing we haven't had time to do yet is #1367 (I-JEPA with timm ViT). Let us know if you would be interested in working on that.
Completed in #1461
We want to refactor the MAE code to use the timm ViT backbone instead of the torchvision ViT. The torchvision ViT differs too much from the ViT used in MAE and doesn't support all the features we need. See #1263 for some of the issues. Refactoring the code lets us use the same backbone as the original MAE code.
Ideally the refactoring allows us to be relatively flexible with the backbone we use so that users are free to use different ViT backbones.
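For context, one thing any MAE backbone has to support is encoding only a random subset of patch tokens. Below is a minimal, dependency-free sketch of that index selection. The function name `random_patch_mask` and its parameters are hypothetical and are not part of this repository, timm, or torchvision. The 0.75 mask ratio is the default commonly used for MAE and is assumed here rather than taken from this issue.

```python
import random


def random_patch_mask(num_patches, mask_ratio=0.75, seed=None):
    """Split patch indices into (kept, masked) lists, MAE-style.

    Hypothetical helper for illustration only.
    """
    if not 0.0 <= mask_ratio < 1.0:
        raise ValueError("mask_ratio must be in [0, 1)")
    rng = random.Random(seed)
    indices = list(range(num_patches))
    rng.shuffle(indices)  # random permutation of all patch indices
    num_keep = int(num_patches * (1.0 - mask_ratio))
    kept = sorted(indices[:num_keep])    # tokens the encoder would see
    masked = sorted(indices[num_keep:])  # tokens left for the decoder to reconstruct
    return kept, masked


# A 224x224 image with 16x16 patches has 14 * 14 = 196 patches.
kept, masked = random_patch_mask(196, mask_ratio=0.75, seed=0)
print(len(kept), len(masked))  # 49 147
```

A backbone that can run its transformer blocks on just the `kept` tokens, instead of only accepting a full image, is the kind of flexibility the refactoring is after.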
Todo