I'm fine-tuning LLaMA 3.1 8B for multi-turn conversations, using this Colab notebook as a reference (it focuses on single-turn conversations).
Question 1:
Can you confirm whether the data format below is correct for preparing fine-tuning data for multi-turn conversations?
```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 26 July 2024\n\n<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nHow's your asthma since you started using your inhaler again?<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nMuch better. I don't know why I didn't take it with me everywhere I went.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nIt's important to carry it with you, especially during times where you're exercising or walking more than usual.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nYeah. I think I've learned my lesson.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nBesides asthma, do you have any other medical problems?<|eot_id|>
```
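For context, this is how I'm producing that string — a minimal sketch assuming `tokenizer` was already loaded via `FastLanguageModel.from_pretrained(...)`; the system preamble (knowledge date / today's date) is inserted by the llama-3.1 template itself:

```python
from unsloth.chat_templates import get_chat_template

# Assumes `tokenizer` comes from FastLanguageModel.from_pretrained(...)
tokenizer = get_chat_template(tokenizer, chat_template="llama-3.1")

messages = [
    {"role": "assistant", "content": "How's your asthma since you started using your inhaler again?"},
    {"role": "user", "content": "Much better. I don't know why I didn't take it with me everywhere I went."},
    {"role": "assistant", "content": "It's important to carry it with you, especially during times where you're exercising or walking more than usual."},
    {"role": "user", "content": "Yeah. I think I've learned my lesson."},
    {"role": "assistant", "content": "Besides asthma, do you have any other medical problems?"},
]

# add_generation_prompt=False for training data: the text should end after
# the final assistant turn, not with an empty assistant header.
text = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=False
)
print(text)
```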
Question 2:
Do I still use the train_on_responses_only method to train only on the assistant outputs and ignore the loss on the user's inputs, given that the conversations are multi-turn?
For example, before masking:

```
<|begin_of_text|><|start_header_id|>system<|end_header_id|>\n\nCutting Knowledge Date: December 2023\nToday Date: 26 July 2024\n\n<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nHow's your asthma since you started using your inhaler again?<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nMuch better. I don't know why I didn't take it with me everywhere I went.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nIt's important to carry it with you, especially during times where you're exercising or walking more than usual.<|eot_id|><|start_header_id|>user<|end_header_id|>\n\nYeah. I think I've learned my lesson.<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\nBesides asthma, do you have any other medical problems?<|eot_id|>
```

After masking:

```
 \n\nHow's your asthma since you started using your inhaler again?<|eot_id|> \n\nIt's important to carry it with you, especially during times where you're exercising or walking more than usual.<|eot_id|> \n\nBesides asthma, do you have any other medical problems?<|eot_id|>
```
Question 3:
For inference on a multi-turn conversation, is the following code the correct way to prepare the input data? If not, can you suggest improvements or confirm its correctness?
```python
from unsloth.chat_templates import get_chat_template

# `model` and `tokenizer` are assumed loaded via FastLanguageModel.from_pretrained(...)
tokenizer = get_chat_template(
    tokenizer,
    chat_template="llama-3.1",
)
FastLanguageModel.for_inference(model)

messages = [
    {'content': 'What brings you back into the clinic today, miss?', 'role': 'assistant'},
    {'content': 'I came in for a refill of my blood pressure medicine.', 'role': 'user'},
    {'content': 'It looks like Doctor Kumar followed up with you last time regarding your hypertension, osteoarthritis, osteoporosis, hypothyroidism, allergic rhinitis, and kidney stones. Have you noticed any changes or do you have any concerns regarding these issues?', 'role': 'assistant'},
    {'content': 'No.', 'role': 'user'},
]

inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,  # Required for generation
    return_tensors="pt",
).to("cuda")

outputs = model.generate(
    input_ids=inputs,
    max_new_tokens=64,
    use_cache=True,
    temperature=1.5,
    min_p=0.1,
)
tokenizer.batch_decode(outputs)
```
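One small refinement I've been considering (my own tweak, not from the notebook): `outputs` includes the echoed prompt tokens, so slicing them off before decoding yields only the model's new reply:

```python
# Decode only the newly generated tokens (drop the echoed prompt).
reply = tokenizer.batch_decode(
    outputs[:, inputs.shape[1]:], skip_special_tokens=True
)[0]
print(reply)
```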
Would you kindly validate whether these approaches are appropriate for multi-turn fine-tuning and inference?