Hello,
Can I use knowledge distillation (as in distillation_with_hubert.sh) to train a smaller model based on an already trained larger zipformer v2 model, or can this only be used for a fairseq model?
And if so, can I use a large non-streaming model to distill to a smaller streaming zipformer v2?
Best regards,
Joachim

Replies: 3 comments 1 reply
-
Yes, both of them. You can train a smaller model by distilling a trained larger model; the smaller model can be streaming or non-streaming even if the larger model is a non-streaming one.
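As a rough illustration of the general idea (not the actual distillation_with_hubert.sh recipe), a minimal PyTorch sketch of soft-label distillation could look like this; `teacher`, `student`, and `supervision_loss_fn` are hypothetical placeholders, not icefall APIs:

```python
# Minimal, generic knowledge-distillation sketch in PyTorch. This is NOT the
# icefall distillation_with_hubert.sh recipe; `teacher`, `student`, and
# `supervision_loss_fn` are hypothetical placeholders for a large non-streaming
# teacher, a smaller (possibly streaming) student, and the usual ASR loss.
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL divergence between temperature-softened teacher and student outputs."""
    t = temperature
    student_log_probs = F.log_softmax(student_logits / t, dim=-1)
    teacher_probs = F.softmax(teacher_logits / t, dim=-1)
    # Scale by t**2 so gradient magnitudes stay comparable across temperatures.
    return F.kl_div(student_log_probs, teacher_probs, reduction="batchmean") * (t * t)


def train_step(student, teacher, features, supervision_loss_fn, alpha=0.5):
    """One training step: supervised loss on labels plus distillation loss."""
    with torch.no_grad():
        teacher_logits = teacher(features)  # frozen, already-trained teacher
    student_logits = student(features)      # smaller student being trained
    loss_sup = supervision_loss_fn(student_logits)  # e.g. CTC/transducer loss
    loss_kd = distillation_loss(student_logits, teacher_logits)
    return alpha * loss_sup + (1.0 - alpha) * loss_kd
```

Note that the real icefall recipe may distill from intermediate teacher embeddings (e.g. quantized codebook indexes) rather than output logits; check distillation_with_hubert.sh for the exact targets and losses.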
-
Thank you! I will give it a try!
-
Hello, what knowledge distillation code should I use to train a smaller model based on an already trained larger zipformer?