Unexpected Behavior After Merging Adapters #1593

Open
FOCUSLLM opened this issue Jan 29, 2025 · 0 comments

After fine-tuning, the model generates impressive responses. However, after merging the adapters, the responses become nonsensical or incoherent. When I reload the merged model, the outputs contain hallucinations (garbled text or unexpected words).

What I Changed:
The only modification I made in the Colab notebook is related to dataset preparation. I have also tried adjusting padding tokens and other configurations, but nothing seems to resolve the issue.

Steps to Reproduce:
1. Fine-tune the model (works well at this stage).
2. Merge the adapters.
3. Generate responses (unexpected behavior occurs).
4. Reload the merged model and generate responses again (same issue persists).

Expected Behavior:
The model should retain the quality of responses seen after fine-tuning, even after merging adapters.
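To make the expected behavior concrete, here is a minimal sketch of what "merging" should preserve. It assumes the adapters are LoRA-style low-rank updates, which the report does not state. All shapes, values, and helper names (`merge_lora`, `adapter_forward`, `max_abs_diff`) are hypothetical and belong to no particular library. The sketch checks that a toy linear layer gives the same output with the adapter applied separately as with the adapter folded into the base weight.

```python
# Toy check: merged weights should reproduce the unmerged adapter output.
# Assumes a LoRA-style update W' = W + (alpha / r) * B @ A (an assumption,
# not something the report confirms). Pure Python, no external libraries.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(x * y for x, y in zip(row, v)) for row in m]

def merge_lora(W, A, B, alpha, r):
    """Fold the adapter into the base weight: W + (alpha / r) * B @ A."""
    scale = alpha / r
    delta = matmul(B, A)
    return [[w + scale * d for w, d in zip(w_row, d_row)]
            for w_row, d_row in zip(W, delta)]

def adapter_forward(W, A, B, alpha, r, x):
    """Unmerged path: W x + (alpha / r) * B (A x)."""
    scale = alpha / r
    base = matvec(W, x)
    update = matvec(B, matvec(A, x))
    return [b + scale * u for b, u in zip(base, update)]

def max_abs_diff(u, v):
    """Largest element-wise absolute difference between two vectors."""
    return max(abs(a - b) for a, b in zip(u, v))

# Hypothetical toy values: 2x3 base weight, rank-1 adapter.
W = [[0.5, -1.0, 0.25], [1.5, 0.0, -0.75]]
A = [[0.1, 0.2, -0.3]]   # shape (r, in_features)
B = [[0.4], [-0.6]]      # shape (out_features, r)
alpha, r = 2.0, 1
x = [1.0, 2.0, 3.0]

before = adapter_forward(W, A, B, alpha, r, x)      # adapter applied separately
after = matvec(merge_lora(W, A, B, alpha, r), x)    # adapter merged into W
drift = max_abs_diff(before, after)
print("max drift after merge:", drift)
```

With a real model, the analogous check would compare logits for a fixed prompt before and after the merge. A large difference would suggest the problem lies in the merge or save step (for example, precision loss when merging into reduced-precision base weights) rather than in the fine-tuning itself.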

Additional Context:
- I plan to host this model on Amazon Bedrock, so stability is crucial.
- I have already tried adjusting padding tokens and other minor settings.
- The issue persists across multiple attempts.
