Speech recognition in ~300kB of code.
Small footprint, standalone, zero-dependency, offline keyword spotting (KWS) CLI tool. Might be useful in some automation tasks. Accepts audio on stdin and recognizes the following words: up, down, left, right, stop.
Here is an example invocation:
rec -q -t raw -c1 -e signed -b 16 -r16k - | ./kws_cli
Make sure you have a decent microphone and that the system input level is set appropriately.
Individual WAV files can be piped (e.g. for testing) using:
sox -S ../untitled.wav -t raw -c1 -e signed -b 16 -r16k - | ./kws_cli
In the demo subdirectory there is a Python script showing how to use kws_cli for simple automation.
(Demo video: kws_cli_demo_1.mov)
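A minimal sketch of such an automation script, assuming kws_cli prints each recognized keyword on its own line to stdout (the output format and the word-to-action mapping here are illustrative assumptions, not the exact demo script):

import subprocess

# Pipe microphone audio through rec into kws_cli, as in the invocation above.
pipeline = "rec -q -t raw -c1 -e signed -b 16 -r16k - | ./kws_cli"
proc = subprocess.Popen(pipeline, shell=True,
                        stdout=subprocess.PIPE, text=True)

# Placeholder actions -- replace with whatever your automation needs.
ACTIONS = {
    "up":    lambda: print("volume up"),
    "down":  lambda: print("volume down"),
    "left":  lambda: print("previous track"),
    "right": lambda: print("next track"),
    "stop":  lambda: print("pause"),
}

for line in proc.stdout:
    word = line.strip().lower()
    action = ACTIONS.get(word)
    if action:
        action()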
Speech recognition is based on this
model and examples from the same repository.
This is a simple model with three layers: 2x LSTM + 1x fully connected.
The model is trained in PyTorch and exported to ONNX.
Then onnx2c is used to convert the model to a bunch of C code.
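A hedged sketch of what this pipeline looks like in PyTorch, down to the ONNX export (layer sizes, feature dimensions, and names are illustrative assumptions; the actual model comes from the referenced repository):

import torch
import torch.nn as nn

class KwsModel(nn.Module):
    def __init__(self, n_features=40, hidden=64, n_classes=5):
        super().__init__()
        # 2x LSTM + 1x fully connected, matching the three-layer
        # structure described above. n_classes=5 covers the five
        # keywords; a real model may add silence/unknown classes.
        self.lstm = nn.LSTM(n_features, hidden, num_layers=2,
                            batch_first=True)
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, x):           # x: (batch, time, features)
        out, _ = self.lstm(x)
        return self.fc(out[:, -1])  # classify from the last time step

model = KwsModel()
dummy = torch.randn(1, 100, 40)     # roughly 1 s of feature frames (assumed)
torch.onnx.export(model, dummy, "kws.onnx", opset_version=11)
# Then: onnx2c kws.onnx > kws_model.c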
LSTM layers have become mainstream in recent years and are well supported in different frameworks. The model is small, so it might be possible to run it on a Cortex-M4/M7 or ESP32 (future work); see below.
The usual CMake routine:
mkdir build
cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make
This model was run on RP2040 and ESP32-S3.
The model runs on a 1 s window of sound samples, so feature extraction and inference must take less than that to run continuously; preferably there should also be some overlap between successive windows. On the RP2040 the inference alone takes ~2.4 s at a 240 MHz clock, so real-time operation is not possible, and feature extraction also takes significant time. A smaller ("narrower") model was tested as well, and its inference still took ~1.2 s. This is still impressive considering that the RP2040 is a Cortex-M0+ without an FPU.
On the ESP32-S3 running at 240 MHz, inference with feature extraction takes ~0.5 s, so real-time operation is possible (e.g. a window every 750 ms with 250 ms overlap gives good results). A demo can be found here.
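A sketch of the windowing arithmetic described above (the 16 kHz rate comes from the invocation examples; the hop and overlap mirror the 750 ms / 250 ms figures; process_window is a hypothetical placeholder for feature extraction plus inference):

SAMPLE_RATE = 16000            # Hz, matches the rec/sox invocations
WINDOW = SAMPLE_RATE           # 1 s of samples per inference
HOP = int(0.75 * SAMPLE_RATE)  # new window every 750 ms -> 250 ms overlap

buf = []

def process_window(window):
    pass  # placeholder: feature extraction + model inference go here

def feed(samples):
    """Append incoming samples; run inference once a full window is ready."""
    global buf
    buf.extend(samples)
    while len(buf) >= WINDOW:
        process_window(buf[:WINDOW])
        buf = buf[HOP:]  # advance by the hop, keeping the 250 ms overlap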