A native output module based on piper TTS. #996

net-ddavies · 2025-01-22T07:20:16Z

'Native' means it's written in C++, so it does not incur the overhead of the generic output module based on the piper command line utility.

sthibaul

Thanks for this!

I have comments that are required before inclusion, but nothing really invasive. If you can't manage around configure.ac, I can work on that. We'll also need a CI to make sure it works, but again I can work on that.

sthibaul · 2025-01-22T09:47:34Z

config/modules/cxxpiper.conf

+# single speaker or multi-speaker.  Single speaker models produce a
+# single speech dispatcher voice.  The speech dispatcher voice name
+# can be listed with 'spd-say -o cxxpiper -L', but it is
+not needed as the voice will be the default, and only, voice available.


This line is missing a #

sthibaul · 2025-01-22T10:08:29Z

config/modules/cxxpiper.conf

+
+# ModelPath and ConfigPath are required.  There should be exactly one of each of them.
+ModelPath "/home/ddavies/src/piper-models/clean100.onnx"
+ConfigPath "/home/ddavies/src/piper-models/clean100.onnx.json"


We cannot hardcode such a path. Is there no standard path defined by piper? Something like /srv/something or /opt/something?

sthibaul · 2025-01-22T10:09:22Z

config/modules/cxxpiper.conf

+# file for a multi-speaker model contains a "speaker_id_map" object
+# that lists an integer speaker id and string mneumonic for each
+# speaker supported by the model.  Since speech dispatcher has no
+# notion of speaker id, speaker selection detqails are hidden from the


typo in detqails

sthibaul · 2025-01-22T10:11:26Z

config/modules/cxxpiper.conf

+# NB:  Unsure if onnx models may allow different " languages within
+# the same multi-speaker model.  REgardless, if there's sufficient memory: it might be possible to load multiple
+# models and have cxxpiper select between them, while presenting the union of
+# the speakers and languages of each model.


Yes, it will be nice long-term to be able to load several models. Even better, list all of them but load them only on demand, so people don't have to modify any file but can just install a model package and see the language pop up in Orca.
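
The "list all, load on demand" idea could look roughly like the sketch below. It is an illustration only: `Model`, `LazyModelRegistry` and the loader callback are hypothetical names, and the real piper voice type and loading call are deliberately left out.

```cpp
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Stand-in for a loaded voice; a real module would hold the piper voice here.
struct Model {
    std::string path;
};

// Hypothetical registry: models are listed up front, loaded on first use.
class LazyModelRegistry {
public:
    using Loader = std::function<std::shared_ptr<Model>(const std::string&)>;

    explicit LazyModelRegistry(Loader loader) : loader_(std::move(loader)) {}

    // Record that a model exists for `language` without loading it.
    void add(const std::string& language, const std::string& path) {
        entries_[language] = Entry{path, nullptr};
    }

    // Languages that can be advertised to clients, loaded or not.
    std::vector<std::string> languages() const {
        std::vector<std::string> out;
        for (const auto& kv : entries_) out.push_back(kv.first);
        return out;
    }

    // Load the model on first request and reuse it afterwards.
    std::shared_ptr<Model> get(const std::string& language) {
        auto it = entries_.find(language);
        if (it == entries_.end()) return nullptr;
        if (!it->second.model) it->second.model = loader_(it->second.path);
        return it->second.model;
    }

private:
    struct Entry {
        std::string path;
        std::shared_ptr<Model> model;
    };
    Loader loader_;
    std::map<std::string, Entry> entries_;
};
```

The voice list can then be built from `languages()` alone, so memory is only spent on models that are actually selected.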

sthibaul · 2025-01-22T10:13:07Z

config/modules/cxxpiper.conf

+#         english-us+male7                    en-US                    male7
+#       english-us+whisper                    en-US                  whisper
+#       english-us+female1                    en-US                  female1
+##         english-us+croak                    en-US                    croak


What are these lines?

sthibaul · 2025-01-22T10:31:59Z

src/modules/cxxpiper.cpp

+	    else {
+		piperConfig.useESpeak = false;
+	    }
+	    cxxpiper::initialize(piperConfig);


sthibaul · 2025-01-22T10:32:21Z

src/modules/cxxpiper.cpp

+	    }
+
+	    SPDVoice **module_list_voices(void)
+    {


There's odd indentation, it is not consistent

sthibaul · 2025-01-22T10:33:39Z

src/modules/cxxpiper.cpp

+	    cxxpiper_handle_sound_icon(data);
+	    break;
+	}
+	}


This double } is odd, better either indent the whole content of the switch, or the content of each case.

sthibaul · 2025-01-22T10:36:51Z

src/modules/cxxpiper.cpp

+	DBG("Sending begin event");
+	module_report_event_begin();
+	(void)cxxpiper::textToAudio(piperConfig, voice, cmdInp,
+				    audioBuffer, result, audioCallback);


Does piper support pipelining? To start sending audio to the server before the whole audio is produced. That would be really important to get good reactivity (but it's fine not to have it for first inclusion)
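
To make the pipelining question concrete, here is a minimal sketch of the pattern: synthesize one sentence at a time and forward each chunk as soon as it is ready, instead of waiting for the whole utterance. Everything in it is an assumption for illustration; `splitSentences`, `speakPipelined` and the two callbacks are hypothetical and do not describe piper's actual API or behaviour.

```cpp
#include <cstdint>
#include <functional>
#include <string>
#include <vector>

using AudioChunk = std::vector<std::int16_t>;

// Deliberately naive splitter: cut after '.', '!' or '?'.
std::vector<std::string> splitSentences(const std::string& text) {
    std::vector<std::string> sentences;
    std::string current;
    for (char c : text) {
        current += c;
        if (c == '.' || c == '!' || c == '?') {
            sentences.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) sentences.push_back(current);
    return sentences;
}

// Synthesize sentence by sentence and hand every chunk to the sink right
// away, so playback of the first sentence can begin before the rest is done.
void speakPipelined(const std::string& text,
                    const std::function<AudioChunk(const std::string&)>& synthesize,
                    const std::function<void(const AudioChunk&)>& sendToServer) {
    for (const auto& sentence : splitSentences(text)) {
        sendToServer(synthesize(sentence));
    }
}
```

The latency before the first audio then depends on the length of the first sentence rather than on the whole message.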

sthibaul · 2025-01-22T10:38:05Z

src/modules/cxxpiper.cpp

+	switch (msgtype) {
+	case SPD_MSGTYPE_CHAR:
+	case SPD_MSGTYPE_KEY:
+	case SPD_MSGTYPE_SPELL:


We shouldn't ignore them, that'd break some screen reading support.
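
As a rough illustration of handling these message types rather than dropping them, the sketch below maps each type to the text that would be handed to the synthesizer. `MsgType` is a local stand-in for Speech Dispatcher's `SPDMessageType`, `textForMessage` is a hypothetical helper, and how characters and keys should really be rendered is a design choice this sketch does not settle. It also indents the content of each case, one of the two layouts suggested for the switch above.

```cpp
#include <string>

// Local stand-in for SPDMessageType so the sketch compiles on its own.
enum class MsgType { Text, SoundIcon, Char, Key, Spell };

// Hypothetical helper: decide what text to synthesize for each message type.
std::string textForMessage(MsgType type, const std::string& data) {
    switch (type) {
    case MsgType::Spell: {
        // Separate the characters so they are read one by one. UTF-8
        // continuation bytes (10xxxxxx) stay attached to their lead byte.
        std::string spelled;
        for (unsigned char c : data) {
            bool continuation = (c & 0xC0) == 0x80;
            if (!continuation && !spelled.empty()) spelled += ' ';
            spelled += static_cast<char>(c);
        }
        return spelled;
    }
    case MsgType::Char:
    case MsgType::Key:
    case MsgType::Text:
        // Assumption: pass the character, key name or text through unchanged.
        return data;
    case MsgType::SoundIcon:
        // Sound icons are handled elsewhere; nothing to synthesize here.
        return std::string();
    }
    return std::string();
}
```

For example, `textForMessage(MsgType::Spell, "abc")` yields `"a b c"`, while a CHAR or KEY message is spoken as received instead of being discarded.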

sthibaul · 2025-01-22T22:54:42Z

> We'll also need a CI to make sure it works, but again I can work on that.

I have added to CI a rule to install piper in /opt

net-ddavies · 2025-01-23T03:40:39Z

Grateful for your fixes to CI and Makefile.am -- thank you.

A native output module based on piper TTS.

d67516d

sthibaul requested changes Jan 22, 2025

samoverton mentioned this pull request Jan 24, 2025

sd_piper: add module for piper speech synthesis #998

Open
