Hi. I looked at the code for the seqkd method. As far as I can tell, the only difference between seqkd and kd is the dataset: for seqkd, you give a prompt to the teacher model, generate a text, and add it to the dataset under the "gen_answer" key. However, the "gen_answer" key does not seem to be used anywhere; even the preprocessing of this dataset ignores it.
So I don't understand how the seqkd and kd methods actually differ in your code.
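To make the first question concrete, here is a minimal sketch of the behaviour one would expect if "gen_answer" were actually consumed: in sequence-level KD the student is normally trained on the teacher-generated text instead of the ground-truth response. The field names `prompt` and `output` and the helper `pick_target` are assumptions for illustration only and are not taken from the repository; only `gen_answer` comes from the generated data described above.

```python
from typing import Dict


def pick_target(sample: Dict[str, str], method: str) -> str:
    """Return the response text the student would be trained on.

    Hypothetical helper: `prompt` and `output` are assumed field names.
    """
    if method == "seqkd":
        # SeqKD: use the text generated by the teacher as the target.
        return sample["gen_answer"]
    # KD: keep the ground-truth response as the target.
    return sample["output"]


sample = {
    "prompt": "Name a primary color.",
    "output": "Red.",
    "gen_answer": "Blue is a primary color.",
}

kd_target = pick_target(sample, "kd")
seqkd_target = pick_target(sample, "seqkd")
```

If the preprocessing never reads `gen_answer`, both methods would end up with the same target, which is the behaviour being asked about.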
My second question is about the Dolly dataset. The original dataset has 15,000 samples, but the data you are using has 13,000 samples. Have you already preprocessed the dataset and removed the samples that are longer than the model's maximum length?
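The kind of length filtering the second question refers to could look like the sketch below. It is only an illustration of the idea: whitespace splitting stands in for the real tokenizer, and the field names `prompt` and `output`, the function `filter_by_length`, and the value of `max_length` are all assumptions, not the repository's actual preprocessing.

```python
from typing import Dict, List


def filter_by_length(samples: List[Dict[str, str]], max_length: int) -> List[Dict[str, str]]:
    """Keep only samples whose prompt plus response fits in max_length tokens.

    Whitespace splitting is an assumed stand-in for the model's tokenizer.
    """
    kept = []
    for s in samples:
        n_tokens = len((s["prompt"] + " " + s["output"]).split())
        if n_tokens <= max_length:
            kept.append(s)
    return kept


samples = [
    {"prompt": "a b c", "output": "d e"},              # 5 tokens, kept
    {"prompt": "a b c d e f", "output": "g h i j"},    # 10 tokens, dropped
]
short_only = filter_by_length(samples, max_length=8)
```

A filter like this would reduce the sample count, which is one possible explanation for the gap between 15,000 and 13,000.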