Sorry, I just finished the previous question and already have to ask a new one. I am using the DNABERT-2 model.
On three different classification tasks, I used OFT, LNTuning, and other PEFT methods to attach fine-tuning modules to linear layers such as ['Wqkv', 'wo', 'gated_layers'], or to other parts such as ['LayerNorm'], and trained with supervision. During training, the logs printed at each logging_steps look normal: the loss keeps decreasing and the performance metrics keep rising. However, when the saved weights are loaded and evaluated on the independent test set, performance is very poor, essentially the same as if no training had happened at all. The following warning is reported when loading:
Some weights of BertForSequenceClassification were not initialized from the model checkpoint at model/DNABERT2-117M and are newly initialized: ['bert.pooler.dense.bias', 'bert.pooler.dense.weight', 'classifier.bias', 'classifier.weight']
Who can help?
@BenjaminBossan

Tasks
An officially supported task in the examples folder
My own task or dataset (give details below)

Reproduction
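The adapters were attached roughly as follows. This is a minimal sketch rather than my full training script: the OFTConfig settings (e.g. r=8) are illustrative, while the model path and the target module names are the real ones quoted above. With task_type="SEQ_CLS", PEFT should treat the classifier head as a module to save alongside the adapter weights.

    from peft import OFTConfig, get_peft_model
    from transformers import AutoModelForSequenceClassification

    # Base model path taken from the warning above; num_labels=2 as in the
    # loading code further down.
    base = AutoModelForSequenceClassification.from_pretrained(
        "model/DNABERT2-117M", trust_remote_code=True, num_labels=2
    )

    peft_config = OFTConfig(
        r=8,  # illustrative value, not my exact setting
        target_modules=["Wqkv", "wo", "gated_layers"],
        task_type="SEQ_CLS",
    )
    model = get_peft_model(base, peft_config)
    model.print_trainable_parameters()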
And this is the code I use to reload the best model weights from each checkpoint path:
import os

import numpy as np
from peft import AutoPeftModelForSequenceClassification
from transformers import DataCollatorWithPadding, Trainer

def load_best_model_for_test(checkpoint_dir, fold_number):
    # Pick the most recently modified checkpoint-* directory as the best model.
    checkpoint_folders = [
        d for d in os.scandir(checkpoint_dir)
        if d.is_dir() and d.name.startswith('checkpoint')
    ]
    best_model_dir = max(checkpoint_folders, key=lambda d: os.path.getmtime(d.path), default=None)
    model = AutoPeftModelForSequenceClassification.from_pretrained(
        best_model_dir.path, trust_remote_code=True, num_labels=2
    )
    return model

def evaluate_on_test_set(models, test_dataset):
    # training_args, tokenizer, and eval_predict are defined earlier in the script.
    test_results = []
    for model in models:
        trainer = Trainer(
            model=model,
            args=training_args,
            eval_dataset=test_dataset,
            data_collator=DataCollatorWithPadding(tokenizer=tokenizer),
            compute_metrics=eval_predict,
        )
        metrics = trainer.evaluate()
        test_results.append(metrics)
    # Average each metric across the per-fold results.
    average_metrics = {
        key: np.mean([result[key] for result in test_results])
        for key in test_results[0].keys()
    }
    return average_metrics
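The two helpers are then driven roughly like this (the directory layout and the fold count here are placeholders, not my actual paths):

    models = [load_best_model_for_test(f"runs/fold_{i}", i) for i in range(5)]
    average_metrics = evaluate_on_test_set(models, test_dataset)
    print(average_metrics)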
However, when I did full-parameter supervised fine-tuning without PEFT, the final results on the independent test set were all normal. I have tried different tasks, different PEFT methods, different fine-tuned modules, and the latest version of peft, and still cannot solve the problem.
Expected behavior
Find the cause and fix the problem.
You can run this script using python example.py train to create the adapter model and then python example.py test to test the loading behavior. Expected is that the modified classifier weights are shown to be all ones.
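example.py itself is not quoted in the thread; a minimal stand-in along the lines it describes might look like the sketch below (the LNTuningConfig choice, the adapter directory name, and the parameter filtering are assumptions). It overwrites the classifier head with ones, saves the adapter, reloads it, and prints the head.

    import sys

    import torch
    from peft import (
        AutoPeftModelForSequenceClassification,
        LNTuningConfig,
        get_peft_model,
    )
    from transformers import AutoModelForSequenceClassification

    ADAPTER_DIR = "tmp-dnabert2-adapter"  # hypothetical output directory

    if sys.argv[1] == "train":
        base = AutoModelForSequenceClassification.from_pretrained(
            "model/DNABERT2-117M", trust_remote_code=True, num_labels=2
        )
        config = LNTuningConfig(task_type="SEQ_CLS", target_modules=["LayerNorm"])
        model = get_peft_model(base, config)
        # Stand-in for training: make the classifier head recognizably non-random.
        with torch.no_grad():
            for name, param in model.named_parameters():
                if "classifier" in name:
                    param.fill_(1.0)
        model.save_pretrained(ADAPTER_DIR)
    elif sys.argv[1] == "test":
        model = AutoPeftModelForSequenceClassification.from_pretrained(
            ADAPTER_DIR, trust_remote_code=True, num_labels=2
        )
        # "original_module" is PEFT's untrained copy of a modules_to_save
        # module; the trained copy lives under modules_to_save.default.
        # All-ones output here means the head survived the round trip.
        for name, param in model.named_parameters():
            if "classifier" in name and "original_module" not in name:
                print(name, param.flatten()[:8])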