Details about seqkd #275

Open
atiehsharifi opened this issue Oct 14, 2024 · 2 comments

atiehsharifi commented Oct 14, 2024

Hi. I looked at the code for the seqkd method. As far as I can tell, the only difference between seqkd and kd is the dataset. In seqkd, you prompt the teacher model to generate text and add it to the dataset under the "gen_answer" key. However, the "gen_answer" key does not appear to be used anywhere, not even when this dataset is preprocessed.
So I don't understand what the difference between the seqkd and kd methods is in your code.
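
For context, here is a minimal sketch of where a teacher-generated field would normally enter the pipeline. In sequence-level KD as usually described, the student is fine-tuned on the teacher's generated text instead of the ground-truth response. The field names "prompt" and "output" and the helper `build_pairs` are assumptions made for illustration; only "gen_answer" comes from the question above, and none of this is taken from the repository's code.

```python
from typing import Dict, List, Tuple


def build_pairs(records: List[Dict[str, str]], method: str) -> List[Tuple[str, str]]:
    """Return (prompt, target) training pairs for the given method.

    Hypothetical helper: "prompt" and "output" are assumed field names.
    """
    pairs = []
    for rec in records:
        if method == "seqkd":
            # Sequence-level KD: the target is the teacher's generated text.
            target = rec["gen_answer"]
        elif method == "kd":
            # Assumed baseline: the target is the ground-truth response.
            target = rec["output"]
        else:
            raise ValueError(f"unknown method: {method}")
        pairs.append((rec["prompt"], target))
    return pairs


# Toy record, invented purely for illustration.
records = [
    {
        "prompt": "Name a primary color.",
        "output": "Red.",
        "gen_answer": "Blue is a primary color.",
    }
]

kd_pairs = build_pairs(records, "kd")
seqkd_pairs = build_pairs(records, "seqkd")
```

If preprocessing never reads "gen_answer", both methods would end up with the same targets, which is the behaviour the question describes.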

My second question is about the Dolly dataset. It has 15,000 samples, but the data you are using has 13,000. Have you already processed this dataset and removed the samples whose length exceeds the model's maximum length?

@t1101675
Contributor

Hi, you can refer to #250 for the first question and #167 for the second question.

@atiehsharifi
Author

Thanks.
