Unexpected Behavior After Merging Adapters #1593

FOCUSLLM · 2025-01-29T14:43:54Z

After fine-tuning, the model generates impressive responses. However, after merging the adapters, the responses become nonsensical or incoherent. Upon reloading, I encounter outputs with "helicansions" (garbled text or unexpected words).

What I Changed:
The only modification I made in the Colab notebook is related to dataset preparation. I have also tried adjusting padding tokens and other configurations, but nothing seems to resolve the issue.

Steps to Reproduce:
Fine-tune the model (works well at this stage).
Merge the adapters.
Generate responses (unexpected behavior occurs).
Reload the merged model and generate responses again (same issue persists).
Expected Behavior:
The model should retain the quality of responses seen after fine-tuning, even after merging adapters.

Additional Context:
Planning to host this model on Amazon Bedrock, so stability is crucial.
Have already tried adjusting padding tokens and other minor settings.
Issue persists across multiple attempts.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unexpected Behavior After Merging Adapters #1593

Unexpected Behavior After Merging Adapters #1593

FOCUSLLM commented Jan 29, 2025

Unexpected Behavior After Merging Adapters #1593

Unexpected Behavior After Merging Adapters #1593

Comments

FOCUSLLM commented Jan 29, 2025