Details about seqkd #275

atiehsharifi · 2024-10-14T16:18:36Z

Hi. I looked at the code for the seqkd method. The difference between this method and the kd method is in the dataset. In seqkd, you give a prompt to the teacher model, generate text, and add it to the dataset under the "gen_answer" key. But the "gen_answer" key is not used anywhere; even when you preprocess this dataset, the key is not read.
So I don't understand how seqkd differs from kd in your code.

My second question is about the Dolly dataset. The original dataset has 15,000 samples, but the data you are using has 13,000 samples. Have you already processed this dataset and removed the samples whose length exceeds the model's maximum length?

t1101675 · 2024-10-14T16:23:51Z

Hi, you can refer to #250 for the first question and #167 for the second question

atiehsharifi · 2024-10-14T16:25:46Z

> Hi, you can refer to #250 for the first question and #167 for the second question

Thanks.
