
FileNotFoundError when using SentenceTransformerTrainingArguments(load_best_model_at_end=True) and Peft #34747

GTimothee opened this issue Nov 15, 2024 · 0 comments

System Info

I used the default Google Colab environment, with the latest versions of transformers and sentence-transformers.

Who can help?

@muellerzr

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Here is the example code as a gist: gist

Just open the gist in a Colab notebook and run it.

Expected behavior

This is a follow-up to another bug found in sentence-transformers. The sentence-transformers library recently integrated PEFT via transformers.integrations. The bug: when using SentenceTransformerTrainingArguments(load_best_model_at_end=True), a FileNotFoundError is raised because the trainer tries to load a classical checkpoint file (pth) while an adapter was saved instead. Looking into the load_best_model function, it simply delegates to transformers.trainer.Trainer, so the transformers library needs to be modified to solve the problem.

The issue is that a function in transformers checks whether the model is a PeftMixedModel or not. If not, the model is not considered a PEFT model and the trainer tries to load it as usual. The problem is that our model is a PeftAdapterMixin, so it is not recognized as a PEFT model.
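A minimal illustration of the mismatch, using stand-in classes rather than the real libraries (the actual class names live in peft and transformers.integrations.peft, and SentenceTransformer's real inheritance is more involved):

```python
# Stand-in classes (hypothetical, for illustration only):
class PeftModel: ...          # stands in for peft.PeftModel
class PeftMixedModel: ...     # stands in for peft.PeftMixedModel
class PeftAdapterMixin: ...   # stands in for transformers' PeftAdapterMixin

class SentenceTransformer(PeftAdapterMixin):  # simplified inheritance
    pass

model = SentenceTransformer()

# The trainer's check only tests against the peft wrapper classes:
print(isinstance(model, (PeftModel, PeftMixedModel)))  # False -> treated as non-PEFT
print(isinstance(model, PeftAdapterMixin))             # True  -> but never checked
```

So even with adapters loaded, the isinstance test fails and the trainer falls back to the regular checkpoint-loading path, where the FileNotFoundError occurs.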

See also: UKPLab/sentence-transformers#3056

In my opinion, the check should be extended with a two-step test: 1) is the model a PeftAdapterMixin, and 2) does it have adapters loaded? This may only be part of the solution, though; a dedicated loading snippet may also be needed directly in transformers.trainer.Trainer._load_best_model.
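The proposed two-step check could look roughly like the sketch below. This is an assumption, not the actual transformers code: the function name is hypothetical, and the `_hf_peft_config_loaded` attribute (which transformers' PeftAdapterMixin sets when an adapter is loaded) may differ across versions.

```python
def is_peft_model(model):
    """Hypothetical sketch of the proposed check: recognize both classic
    PEFT wrappers and models that merely mix in PeftAdapterMixin."""
    try:
        from peft import PeftModel, PeftMixedModel
    except ImportError:
        return False  # peft not installed -> cannot be a PEFT model

    # Existing behavior: the check used by transformers.trainer.Trainer
    if isinstance(model, (PeftModel, PeftMixedModel)):
        return True

    # Proposed addition: a PeftAdapterMixin subclass counts as a PEFT
    # model only if adapters have actually been loaded into it.
    try:
        from transformers.integrations.peft import PeftAdapterMixin
    except ImportError:
        return False
    return isinstance(model, PeftAdapterMixin) and bool(
        getattr(model, "_hf_peft_config_loaded", False)
    )
```

With such a check, _load_best_model could branch into an adapter-loading path (e.g. reloading the saved adapter weights) instead of looking for a classical checkpoint file.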
