A native output module based on piper TTS. #996
base: master
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,114 @@ | ||
# | ||
# Configuration for cxxpiper speech dispatcher output module. | ||
# | ||
|
||
Debug 0 | ||
|
||
# Piper doesn't have voices in the speech-dispatcher sense. Piper has a | ||
# "model" and the model's "configuration". A model/config may be | ||
# single speaker or multi-speaker. Single speaker models produce a | ||
# single speech dispatcher voice. The speech dispatcher voice name | ||
# can be listed with 'spd-say -o cxxpiper -L', but it is | ||
not needed as the voice will be the default, and only, voice available. | ||
|
||
# Piper multi-speaker models produce a discrete speech dispatcher | ||
# voice for each speaker the model supports. The configuration | ||
# file for a multi-speaker model contains a "speaker_id_map" object | ||
# that lists an integer speaker id and string mnemonic for each | ||
# speaker supported by the model. Since speech dispatcher has no | ||
# notion of speaker id, speaker selection detqails are hidden from the | ||
Review comment: typo in |
||
# user by instead exposing voices of the form | ||
# <model-name>~<speaker-id>~<mnemonic> with the output module | ||
# mapping between "voice names" and the current model's speakers. | ||
|
||
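A minimal sketch of the naming scheme described above, written in Python for brevity rather than in the module's C++. It assumes that `speaker_id_map` maps each mnemonic string to its integer speaker id and that the model name is the model file name without the `.onnx` suffix; neither detail is confirmed by this diff. The function name `voice_names` and every speaker entry other than `5393` (id 2) are hypothetical.

```python
import json
from pathlib import Path
from typing import List


def voice_names(model_path: str, config_text: str) -> List[str]:
    """Build '<model-name>~<speaker-id>~<mnemonic>' names for a multi-speaker model."""
    # Assumption: the model name is the file name minus the ".onnx" suffix.
    model_name = Path(model_path).stem
    config = json.loads(config_text)
    # Assumption: speaker_id_map maps mnemonic string -> integer speaker id.
    speaker_map = config.get("speaker_id_map") or {}
    # Order by speaker id so that the first entry is the speaker at index 0.
    ordered = sorted(speaker_map.items(), key=lambda item: item[1])
    return [f"{model_name}~{speaker_id}~{mnemonic}" for mnemonic, speaker_id in ordered]


# Hypothetical config excerpt; only "5393" with id 2 comes from the example above.
example_config = '{"speaker_id_map": {"5393": 2, "3922": 0, "1246": 1}}'
names = voice_names("/path/to/clean100.onnx", example_config)
print(names)
```

With names built this way, the speaker mnemonic is always a substring of the voice name, which matches the note about `DefaultVoice` further down in the file.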
# NB: Unsure if onnx models may allow different languages within | ||
# the same multi-speaker model. Regardless, if there's sufficient memory, it might be possible to load multiple | ||
# models and have cxxpiper select between them, while presenting the union of | ||
# the speakers and languages of each model. | ||
Review comment: Yes, it will be nice in the long term to be able to load several models. Even better, list all of them but load each one only on demand, so that people don't have to modify any file: they just install a model package and see the language pop up in Orca. |
||
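One possible shape for the "list everything, load on demand" idea, sketched in Python for brevity. It assumes that models live in a single directory and that each `<name>.onnx` has a sibling `<name>.onnx.json`, as in the paths used in this file. The class `LazyModels`, the `loader` callback, and the file names in the demo are hypothetical; nothing here is an existing piper or speech-dispatcher API.

```python
import tempfile
from pathlib import Path


class LazyModels:
    """List every model found in a directory, but load each one only on first use."""

    def __init__(self, model_dir, loader):
        self._loader = loader   # callable(onnx_path, config_path) -> loaded model
        self._paths = {}        # model name -> (onnx path, config path)
        self._loaded = {}       # model name -> loaded model (filled lazily)
        for onnx in sorted(Path(model_dir).glob("*.onnx")):
            config = onnx.with_name(onnx.name + ".json")
            if config.is_file():  # skip models that lack a configuration file
                self._paths[onnx.stem] = (onnx, config)

    def available(self):
        """Names that can be offered to clients without loading anything."""
        return list(self._paths)

    def get(self, name):
        """Load the model on first request and reuse it afterwards."""
        if name not in self._loaded:
            onnx, config = self._paths[name]
            self._loaded[name] = self._loader(onnx, config)
        return self._loaded[name]


load_calls = []


def fake_loader(onnx_path, config_path):
    # Stand-in for the real (expensive) model load.
    load_calls.append(onnx_path.name)
    return f"loaded:{onnx_path.stem}"


with tempfile.TemporaryDirectory() as tmp:
    for stem in ("en_US-amy-medium", "en_US-joe-medium"):
        (Path(tmp) / f"{stem}.onnx").touch()
        (Path(tmp) / f"{stem}.onnx.json").touch()
    models = LazyModels(tmp, fake_loader)
    listed = models.available()
    first = models.get("en_US-joe-medium")
    second = models.get("en_US-joe-medium")
```

Listing is cheap because it only touches the file system; the memory cost of a model is paid the first time a client actually selects one of its voices.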
# For now all directives with language fields require the language code, but ignore it. | ||
|
||
# ModelPath and ConfigPath are required. There should be exactly one of each of them. | ||
ModelPath "/home/ddavies/src/piper-models/clean100.onnx" | ||
ConfigPath "/home/ddavies/src/piper-models/clean100.onnx.json" | ||
Review comment: We cannot hardcode such a path. Is there no standard path defined by piper? Something like /srv/something or /opt/something? |
||
|
||
# For single-speaker models, DefaultVoice is ignored, and logged as such, with | ||
# a warning. For multi-speaker models, DefaultVoice is optional. If it is | ||
# not specified, the first speaker of the multi-speaker model becomes | ||
# the default speaker for the lifetime of the cxxpiper output module | ||
# and also for future runtimes unless this configuration is changed. | ||
# When specified along with a multi-speaker model, the argument is a string that matches one of the | ||
# "voices" listed by spd-say -o cxxpiper -L . Note that piper's | ||
# notion of "speaker" appears to the user as the "voice" concept of | ||
# speech dispatcher. This is pretty much invisible to the user, but | ||
# note that it means that the strings listed in the .json | ||
# configuration file in the speaker_id_map object are not the same as | ||
# the voices listed by spd-say (i.e. the speaker ids are substrings of | ||
# the listed voices). We could also match the substrings, but for | ||
# now we don't: only the full "voice" string is | ||
# recognized. If the voice string can't be matched against the voices | ||
# found when the model is loaded, then the first speaker becomes the | ||
# default for the lifetime of the output module. A warning is logged | ||
# if the string can't be matched and the voice name of the first | ||
# speaker (index 0) is included in the warning message. | ||
DefaultVoice "clean100~2~5393" | ||
|
||
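The selection rules in the comment above can be summarized in a short Python sketch. It follows only what the comment states: exact matching against the listed voice names, fallback to the speaker at index 0 with a warning, and `DefaultVoice` being ignored for single-speaker models. The function name `resolve_default_voice`, the logger name, and the voice names other than `clean100~2~5393` are hypothetical.

```python
import logging
from typing import List, Optional

log = logging.getLogger("cxxpiper-sketch")  # hypothetical logger name


def resolve_default_voice(voices: List[str], requested: Optional[str]) -> str:
    """Pick the default voice from the voices found when the model was loaded."""
    first = voices[0]  # speaker at index 0
    if len(voices) == 1:
        # Single-speaker model: DefaultVoice is ignored, with a warning.
        if requested is not None:
            log.warning("DefaultVoice %r ignored for single-speaker model", requested)
        return first
    if requested is None:
        return first  # DefaultVoice is optional for multi-speaker models
    if requested in voices:
        return requested  # exact match on the full voice name only
    # No substring matching: a bare speaker mnemonic does not select a voice.
    log.warning("DefaultVoice %r not found, using %r", requested, first)
    return first


voices = ["clean100~0~3922", "clean100~1~1246", "clean100~2~5393"]
chosen = resolve_default_voice(voices, "clean100~2~5393")
fallback = resolve_default_voice(voices, "5393")
```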
# AddVoice (optional) is reused from the generic output module. It maps voice types to voice names within a language code. | ||
# It does not do anything useful for single speaker models and is ignored. For | ||
# multi-speaker models the language code is required, but ignored, at least for now. | ||
AddVoice "en_US" "MALE1" "clean100~33~8419" | ||
AddVoice "en_US" "FEMALE1" "clean100~25~4137" | ||
|
||
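To illustrate how the `AddVoice` lines above could be interpreted, here is a small Python sketch that builds a voice-type table and, as the comment says, requires the language code but ignores it. This is not the parser speech-dispatcher actually uses; `parse_add_voice` is a hypothetical helper and the quoting rules are approximated with `shlex`.

```python
import shlex
from typing import Dict, Iterable


def parse_add_voice(lines: Iterable[str]) -> Dict[str, str]:
    """Return {voice type: voice name} from AddVoice directives."""
    table = {}
    for line in lines:
        stripped = line.strip()
        # Skip blank lines, comments, and every other directive.
        if not stripped.startswith("AddVoice"):
            continue
        fields = shlex.split(stripped)
        if len(fields) != 4:
            raise ValueError(f"AddVoice needs language, type and voice: {stripped!r}")
        _language, voice_type, voice_name = fields[1:]
        # The language code is required but ignored, at least for now.
        table[voice_type] = voice_name
    return table


conf_lines = [
    'AddVoice "en_US" "MALE1" "clean100~33~8419"',
    'AddVoice "en_US" "FEMALE1" "clean100~25~4137"',
    '#AddVoice "en_US" "MALE1" ""',
]
voice_table = parse_add_voice(conf_lines)
```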
#ModelPath "/home/ddavies/src/piper/src/python_run/en_US-joe-medium.onnx" | ||
#ConfigPath "/home/ddavies/src/piper/src/python_run/en_US-joe-medium.onnx.json" | ||
#AddVoice "en_US" "MALE1" "" | ||
|
||
#ModelPath "/home/ddavies/src/piper-models/en_US-joe-medium.onnx" | ||
#ConfigPath "/home/ddavies/src/piper-models/en_US-joe-medium.onnx.json" | ||
|
||
#ModelPath "/home/ddavies/src/piper-models/en_US-amy-medium.onnx" | ||
#ConfigPath "/home/ddavies/src/piper-models/en_US-amy-medium.onnx.json" | ||
#AddVoice "en_US" "FEMALE1" "" | ||
|
||
#ModelPath "/home/ddavies/src/piper-models/en_US-l2arctic-medium.onnx" | ||
#ConfigPath "/home/ddavies/src/piper-models/en_US-l2arctic-medium.onnx.json" | ||
#en_US-l2arctic-medium~SVBI~2 | ||
#DefaultVoice "english-us+male1" | ||
#DefaultVoice "en_US-l2arctic-medium~2~SVBI" | ||
# english-us en-US none | ||
# english-us+female2 en-US female2 | ||
# english-us+female3 en-US female3 | ||
# english-us+female4 en-US female4 | ||
# english-us+female5 en-US female5 | ||
#english-us+female_whisper en-US female_whisper | ||
# english-us+klatt en-US klatt | ||
# english-us+klatt2 en-US klatt2 | ||
# english-us+klatt3 en-US klatt3 | ||
# english-us+klatt4 en-US klatt4 | ||
# english-us+male2 en-US male2 | ||
# english-us+male3 en-US male3 | ||
# english-us+male4 en-US male4 | ||
# english-us+male5 en-US male5 | ||
# english-us+male6 en-US male6 | ||
# english-us+male7 en-US male7 | ||
# english-us+whisper en-US whisper | ||
# english-us+female1 en-US female1 | ||
## english-us+croak en-US croak | ||
Review comment: What are these lines? |
||
#AddVoice "en-US" "MALE1" "en_US-l2arctic-medium~14~YKWK" | ||
#AddVoice "en-US" "FEMALE1" "en_US-l2arctic-medium~9~PNI" | ||
|
||
# Sound Icons are configured and work like espeak. | ||
SoundIconFolder "/usr/share/sounds/sound-icons/" | ||
SoundIconVolume 0 | ||
|
||
# These are optional. The 'None' and 'All' levels are fixed: 'None' omits as much | ||
# punctuation as possible and 'All' omits as little as possible. | ||
PunctSome "-_+*/\\&$%^~!@#" | ||
PunctMost "(){}[]<>;:-_+*/\\&$%^~!@#\"'" | ||
|
||
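A rough Python sketch of how the four punctuation levels relate to the two configurable sets. The `PunctSome` and `PunctMost` strings are copied from the lines above. Treating 'All' as the full ASCII punctuation set is an assumption made for illustration, and `spoken_punctuation` is a hypothetical helper, not a function of the module.

```python
import string

# Values copied from the PunctSome and PunctMost directives above.
PUNCT_SOME = "-_+*/\\&$%^~!@#"
PUNCT_MOST = "(){}[]<>;:-_+*/\\&$%^~!@#\"'"


def spoken_punctuation(level: str) -> set:
    """Return the set of punctuation characters spoken at the given level."""
    if level == "none":
        return set()                     # fixed: omit as much as possible
    if level == "some":
        return set(PUNCT_SOME)           # configurable
    if level == "most":
        return set(PUNCT_MOST)           # configurable
    if level == "all":
        # Assumption: 'all' covers every ASCII punctuation character.
        return set(string.punctuation)   # fixed: omit as little as possible
    raise ValueError(f"unknown punctuation level: {level!r}")


some_chars = spoken_punctuation("some")
most_chars = spoken_punctuation("most")
```

With the values in this file, each level is a superset of the one before it, so raising the level never silences a character that was spoken at a lower level.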
# Piper uses eSpeak NG for some models. Piper ships the eSpeak NG data and distros | ||
# may also provide it. Default is "/usr/share/espeak-ng-data/". | ||
# It should probably be considered required, but if a model doesn't use espeak | ||
# it might work to omit it. | ||
ESpeakNGDataDirPath "/home/ddavies/src/piper/install/espeak-ng-data" | ||
Review comment: The default should be fine, so leave it as such in the configuration file. |
||
|
||
# End of cxxpiper.conf |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -287,6 +287,25 @@ CLEANFILES += $(EXTRA_sd_kali_DEPENDENCIES) | |
endif | ||
endif | ||
|
||
# | ||
# cxxpiper | ||
# | ||
modulebin_PROGRAMS += sd_cxxpiper | ||
PIPER_LIB_DIR = /usr/lib/piper | ||
PIPER_SRC_DIR = /usr/local/src/piper | ||
sd_cxxpiper_SOURCES = cxxpiper.cpp module_utils_addvoice.c module_utils_play.c $(common_SOURCES) | ||
sd_cxxpiper_CPPFLAGS = -I/usr/include/piper_phonemize/include/ \ | ||
-I$(PIPER_SRC_DIR)/src/cpp/ \ | ||
$(AM_CPPFLAGS) | ||
sd_cxxpiper_LDADD = $(top_builddir)/src/common/libcommon.la \ | ||
-L$(PIPER_LIB_DIR) \ | ||
-lpiper_phonemize \ | ||
-lonnxruntime \ | ||
-lespeak-ng \ | ||
-lrubberband \ | ||
$(SNDFILE_LIBS) \ | ||
$(common_LDADD) | ||
Review comment: We really need to make this optional, because a lot of people will not have piper installed on their system. Actually, piper doesn't seem to have a real installation makefile target that would properly install header files? This really should be fixed upstream, otherwise it'll be a mess... This is normally done in |
||
|
||
# | ||
# voxin module | ||
# | ||
|
Review comment: This line is missing a #