
Support for xlm-roberta #422

Open
umiron opened this issue Sep 23, 2024 · 2 comments

umiron commented Sep 23, 2024

Is it possible to add support for xlm-roberta? It's the same architecture as roberta, except with a larger vocabulary, since it is multilingual.

@metric-space (Contributor)

Hey @umiron
I believe there isn't anything within mergekit that would be a barrier to merges between xlm-roberta models, as the architecture format is oblivious to tensor sizes.

If this really matches up with the xlm-roberta weight names and architecture, add the architecture name (XLMRobertaForMaskedLM) here locally and test whether it works.
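For anyone following along: mergekit recognizes models by the architecture name a checkpoint reports, so the local change being suggested amounts to listing the XLM variant alongside the existing RoBERTa entry. The sketch below is illustrative only; the field layout is an assumption, not mergekit's actual definition file, so check your local checkout for the real structure.

```json
{
  "model_type": "roberta",
  "architectures": [
    "RobertaForMaskedLM",
    "XLMRobertaForMaskedLM"
  ]
}
```

Since the tensor layout is identical and only the vocabulary size differs, no other part of the definition should need to change.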


umiron commented Sep 24, 2024

Thanks, @metric-space. This works well (though in my case the change was to this file, since the relevant architecture was XLMRobertaModel rather than XLMRobertaForMaskedLM).
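In other words, for a checkpoint saved as a bare encoder (no masked-LM head), the analogous hypothetical entry would register the XLMRobertaModel name instead; again, the exact layout below is an assumption:

```json
{
  "model_type": "roberta",
  "architectures": [
    "RobertaModel",
    "XLMRobertaModel"
  ]
}
```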
