Hey @umiron
I believe there isn't anything within mergekit that is a barrier to merges between xlm-roberta models, as the architecture format is tensor-size oblivious.
If this really matches up with the xlm-roberta weight names and architecture, add the architecture name (XLMRobertaForMaskedLM) here locally and test whether it works.
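If you want to confirm that the weight names really do line up before touching mergekit, here is a minimal sketch (my own check, not part of mergekit) that compares the state-dict keys of a roberta and an xlm-roberta checkpoint using transformers; the hub ids are just examples:

```python
from transformers import AutoModelForMaskedLM

# Compare weight names between a roberta and an xlm-roberta checkpoint.
# mergekit's architecture definitions key off tensor names rather than
# tensor sizes, so matching names is the main requirement for a merge.
roberta = AutoModelForMaskedLM.from_pretrained("roberta-base")
xlmr = AutoModelForMaskedLM.from_pretrained("xlm-roberta-base")

mismatch = set(roberta.state_dict()) ^ set(xlmr.state_dict())
print(mismatch or "weight names match")
```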
Thanks, @metric-space. This works well (except in my case the change was to this file, since the relevant architecture was XLMRobertaModel rather than XLMRobertaForMaskedLM).
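For anyone else hitting this: the name mergekit needs to recognize is whatever architecture your checkpoint reports in its config, which you can check with transformers before deciding which name to add (the model id below is only an example):

```python
from transformers import AutoConfig

# The architecture name(s) listed in the config are what mergekit matches
# against its definitions, so this tells you which name to register
# (e.g. XLMRobertaForMaskedLM vs. XLMRobertaModel).
config = AutoConfig.from_pretrained("xlm-roberta-base")
print(config.architectures)
```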
Is it possible to add support for xlm-roberta? It's the same architecture as roberta, except that it has a larger vocabulary since it is multilingual.