
Validation of Fine-Tuning and Inference Methods for Multi-Turn Conversations with LLaMA 3.1 8B #1564

Kshitiz-Khandel opened this issue Jan 20, 2025 · 1 comment

Kshitiz-Khandel commented Jan 20, 2025

I'm fine-tuning LLaMA 3.1 8B for multi-turn conversations, using this Colab notebook as a reference (it focuses on single-turn conversations).

Question 1:
Can you confirm whether the data format below is correct for preparing fine-tuning data for multi-turn conversations?

<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 26 July 2024\n\n<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nHow's your asthma since you started using your inhaler again?<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nMuch better. I don't know why I didn't take it with me everywhere I went.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nIt's important to carry it with you, especially during times where you're exercising or walking more than usual.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nYeah. I think I've learned my lesson.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nBesides asthma, do you have any other medical problems?<|eot_id|>
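
For context, rather than writing this string by hand, I build it from a structured conversation with apply_chat_template; a minimal sketch (assuming the tokenizer has already been wrapped with Unsloth's get_chat_template as in Question 3 below, so the llama-3.1 template inserts the system header with the knowledge-date preamble automatically):

from unsloth.chat_templates import get_chat_template

tokenizer = get_chat_template(tokenizer, chat_template="llama-3.1")

conversation = [
    {"role": "assistant", "content": "How's your asthma since you started using your inhaler again?"},
    {"role": "user", "content": "Much better. I don't know why I didn't take it with me everywhere I went."},
    {"role": "assistant", "content": "Besides asthma, do you have any other medical problems?"},
]

# tokenize=False returns the formatted training string with the
# <|start_header_id|>/<|eot_id|> special tokens, which should match
# the hand-written example above.
text = tokenizer.apply_chat_template(conversation, tokenize=False)
print(text)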

Question 2:
Given that the conversations are multi-turn, do I still use the train_on_responses_only method to compute the loss only on the assistant outputs and ignore the user's inputs?

E.g., before masking:
"<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 26 July 2024\n\n<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nHow's your asthma since you started using your inhaler again?<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nMuch better. I don't know why I didn't take it with me everywhere I went.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nIt's important to carry it with you, especially during times where you're exercising or walking more than usual.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nYeah. I think I've learned my lesson.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nBesides asthma, do you have any other medical problems?<|eot_id|>"

After masking:
" \n\nHow's your asthma since you started using your inhaler again?<|eot_id|> \n\nIt's important to carry it with you, especially during times where you're exercising or walking more than usual.<|eot_id|> \n\nBesides asthma, do you have any other medical problems?<|eot_id|>"

Question 3:
For inference on a multi-turn conversation, is the following the correct way to prepare the input? If not, can you suggest improvements or confirm its correctness?

from unsloth import FastLanguageModel
from unsloth.chat_templates import get_chat_template

tokenizer = get_chat_template(
    tokenizer,
    chat_template="llama-3.1",
)

# Switch the model into Unsloth's faster inference mode.
FastLanguageModel.for_inference(model)

messages = [
    {'role': 'assistant', 'content': 'What brings you back into the clinic today, miss?'},
    {'role': 'user', 'content': 'I came in for a refill of my blood pressure medicine.'},
    {'role': 'assistant', 'content': 'It looks like Doctor Kumar followed up with you last time regarding your hypertension, osteoarthritis, osteoporosis, hypothyroidism, allergic rhinitis, and kidney stones. Have you noticed any changes or do you have any concerns regarding these issues?'},
    {'role': 'user', 'content': 'No.'},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,  # Required for generation
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=64,
    use_cache=True,
    temperature=1.5,
    min_p=0.1,
)

tokenizer.batch_decode(outputs)
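
As a follow-up detail, to print only the newly generated reply rather than the whole prompt, I slice off the prompt tokens first (the standard Hugging Face pattern, not Unsloth-specific):

# outputs includes the prompt; keep only the newly generated tokens.
new_tokens = outputs[:, inputs.shape[1]:]
print(tokenizer.batch_decode(new_tokens, skip_special_tokens=True)[0])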

Could you kindly confirm whether these approaches are appropriate for multi-turn fine-tuning and inference?

danielhanchen (Contributor) commented:
Yes, these all look correct! Apologies for the delay!
