stitching audio samples to generate diverse positive dataset #59
Comments
With #66, it is possible to generate enough samples for train/dev/test. First, I need to set up a proper evaluation setting. To improve the TP rate, restricting the stitching to smooth transitions between phonemes might be a good option.
Tested both res8 and seq-lstm.
Current experiment plan:
hey_fire_fox should be able to give us initial results, but we might not have enough "real" samples for other wake words. Unlike Common Voice, GSC is collected per word.
< seq-lstm >
< res8 >
To improve the audio sample quality, I applied secondary filtering with keyword spotting.
Keyword spotting verification definitely helped. < res8 >
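A minimal sketch of what such a secondary filtering pass could look like. The thread does not say how the verification was implemented, so everything here is an assumption: `filter_by_kws`, `score_fn`, and the `threshold` value are hypothetical names, with `score_fn` standing in for whatever keyword spotting model returns a wake word confidence for a stitched clip.

```python
from typing import Callable, List, Sequence, TypeVar

Clip = TypeVar("Clip")


def filter_by_kws(
    clips: Sequence[Clip],
    score_fn: Callable[[Clip], float],
    threshold: float = 0.5,
) -> List[Clip]:
    """Keep only the stitched clips that the keyword spotter accepts.

    `score_fn` is a hypothetical callable that maps a clip to a
    wake-word confidence in [0, 1]; `threshold` is an assumed cutoff.
    """
    return [clip for clip in clips if score_fn(clip) >= threshold]
```

The cutoff trades dataset size against quality: a higher value keeps fewer but cleaner stitched samples.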
Found that detection got a little better after putting more weight on "hey" (real datasets).
I therefore double-checked the number of samples for each vocab word used to generate the stitched dataset, and found that the weak detection might simply be due to the small number of samples for "hey".
I believe phoneme-based stitching is inevitable.
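The per-vocab check described above can be sketched as follows. This is an illustration only, not the code used in the project: `count_vocab_samples` is a hypothetical helper, and it assumes transcripts are already normalized (lowercase-comparable, punctuation stripped) so that whitespace tokenization is enough.

```python
from typing import Dict, Iterable, Sequence


def count_vocab_samples(
    transcripts: Iterable[str], vocab: Sequence[str]
) -> Dict[str, int]:
    """Count how many transcripts contain each vocab word at least once."""
    counts = {word: 0 for word in vocab}
    for transcript in transcripts:
        # Assumption: transcripts are normalized, so splitting on
        # whitespace yields clean word tokens.
        tokens = set(transcript.lower().split())
        for word in vocab:
            if word in tokens:
                counts[word] += 1
    return counts
```

A vocab word with a much lower count than the others (as reported for "hey") limits how diverse the stitched samples can be.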
Explored bigger datasets (20000 samples).
tl;dr: detection is slightly better, but dev/test accuracy is still not that great. I should focus on improving these first.
@ljj7975 What do you mean by secondary filtering?
For some wake words, we get only a few dev/test samples because their transcripts must exactly match the wake word.
For example, when I attempted to generate a dataset for "love you", the dev and test datasets contained only two samples each.
Given that the train set contains samples whose transcripts contain at least one of the vocab words and are aligned with the audio (by MFA), we can possibly stitch some of these samples together to generate synthetic wake word samples.
For example, stitching "hey baby", "I saw fire", and "wow, there was a fox" to generate a sample for "hey firefox".
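The stitching idea can be sketched roughly as below. This is a minimal illustration under stated assumptions, not the project's implementation: `AlignedWord`, `Sample`, `cut_word`, and `stitch` are hypothetical names, the word-level time spans are assumed to come from the forced aligner, and audio is represented as a plain list of floats to keep the example dependency-free.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class AlignedWord:
    """One transcript word with its time span in seconds (hypothetical)."""
    word: str
    start: float
    end: float


@dataclass
class Sample:
    """A mono audio clip plus its word-level alignment (hypothetical)."""
    audio: List[float]
    sample_rate: int
    words: List[AlignedWord]


def cut_word(sample: Sample, target: str) -> List[float]:
    """Return the waveform slice for the first occurrence of `target`."""
    for aligned in sample.words:
        if aligned.word == target:
            lo = int(aligned.start * sample.sample_rate)
            hi = int(aligned.end * sample.sample_rate)
            return sample.audio[lo:hi]
    raise ValueError(f"{target!r} not found in sample")


def stitch(wake_word: Sequence[str], sources: Dict[str, Sample]) -> List[float]:
    """Concatenate one segment per vocab word, in wake-word order."""
    stitched: List[float] = []
    for vocab in wake_word:
        stitched.extend(cut_word(sources[vocab], vocab))
    return stitched
```

With the example above, `sources` would map "hey" to the "hey baby" clip, "fire" to the "I saw fire" clip, and "fox" to the "wow, there was a fox" clip, and `stitch(["hey", "fire", "fox"], sources)` would return the synthetic waveform. Plain concatenation leaves hard boundaries between segments, which is one reason smoother transitions may be worth exploring.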