
A Text To Speech node using Kokoro TTS in ComfyUI


ardentillumina/ComfyUI-KokoroTTS

 
 


Kokoro TextToSpeech Node for ComfyUI

A custom node for ComfyUI that provides Text-to-Speech capabilities using the Kokoro TTS engine.

[Image: The basic TTS]

[Image: TTS with LatentSync for Lipsync]

[Video: 2025-01-20.20-55-55.mp4] Example result.

Features

  • High-quality text-to-speech synthesis
  • Multiple voice options
  • Support for multilingual text
  • Easy integration with ComfyUI workflows

Installation

  1. Clone this repository into your ComfyUI custom nodes directory:
cd ComfyUI/custom_nodes
git clone https://github.com/benjiyaya/ComfyUI-KokoroTTS
  2. Download required model files:

    • Create a folder named Kokorotts under ComfyUI/models
    • Go to https://huggingface.co/thewh1teagle/Kokoro/tree/main
    • Download the model file 'kokoro-v0_19.onnx'
    • Download the voices file 'voices.json'
    • Place both files in the ComfyUI/models/Kokorotts directory
  3. Install required Python packages:

pip install -r requirements.txt

or, if you are using the Windows portable version:

Go to the 'ComfyUI_windows_portable' folder and run the command:
python_embeded\python.exe -m pip install -r ComfyUI\custom_nodes\ComfyUI-KokoroTTS\requirements.txt
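
To confirm the download step worked before starting ComfyUI, a small check like the one below can help. It is only a sketch: the file names and the Kokorotts folder come from the steps above, while the function name `missing_model_files` and the default `ComfyUI` root path are assumptions made for illustration.

```python
from pathlib import Path

# File names taken from the download step above.
REQUIRED_FILES = ("kokoro-v0_19.onnx", "voices.json")


def missing_model_files(comfy_root="ComfyUI"):
    """Return the required files that are absent from <comfy_root>/models/Kokorotts."""
    model_dir = Path(comfy_root) / "models" / "Kokorotts"
    return [name for name in REQUIRED_FILES if not (model_dir / name).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the folder that contains ComfyUI.
    missing = missing_model_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found.")
```

Pass a different `comfy_root` if your ComfyUI folder lives elsewhere, for example inside 'ComfyUI_windows_portable'.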

Available Voices

The following voices are available:

  • af (American Female)
  • af_sarah (American Female Sarah)
  • af_bella (American Female Bella)
  • af_nicole (American Female Nicole)
  • af_sky (American Female Sky)
  • am_adam (American Male Adam)
  • am_michael (American Male Michael)
  • bf_emma (British Female Emma)
  • bf_isabella (British Female Isabella)
  • bm_george (British Male George)
  • bm_lewis (British Male Lewis)
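
The voice ids in this list appear to follow a pattern: the first letter is the accent (a = American, b = British), the second is the gender (f = Female, m = Male), and an optional name follows the underscore. The sketch below decodes an id on that assumption. The pattern is inferred from the list above rather than documented by the model, and `describe_voice` is a hypothetical helper, not part of the node.

```python
# Prefix letters inferred from the voice list above (an assumption, not a documented scheme).
ACCENTS = {"a": "American", "b": "British"}
GENDERS = {"f": "Female", "m": "Male"}


def describe_voice(voice_id):
    """Turn an id such as 'bm_george' into a readable label."""
    prefix, _, name = voice_id.partition("_")
    if len(prefix) != 2 or prefix[0] not in ACCENTS or prefix[1] not in GENDERS:
        raise ValueError(f"Unrecognized voice id: {voice_id!r}")
    parts = [ACCENTS[prefix[0]], GENDERS[prefix[1]]]
    if name:
        parts.append(name.capitalize())
    return " ".join(parts)


print(describe_voice("bm_george"))  # British Male George
print(describe_voice("af"))         # American Female
```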

Usage

  1. In ComfyUI, locate the "Kokoro TextToSpeech" node under the "kokoro" category
  2. Connect the node to your workflow
  3. Input your text and select a voice
  4. The node will output an audio waveform that can be used with other audio nodes

Input Parameters

  • text: The text you want to convert to speech (supports multiline text)
  • speaker: The voice to use for speech synthesis (default: af_sarah)

Output

  • audio: Audio data in the format expected by ComfyUI audio nodes
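
For readers curious how inputs and outputs like these are usually declared, here is a minimal sketch of a ComfyUI custom node with the same two inputs and one audio output. It is not this repository's source code. The class name `KokoroTTSSketch`, the `MODEL_DIR` path, and the method name `generate` are assumptions. The call into kokoro-onnx relies on that library's `Kokoro(model_path, voices_path)` constructor and `create()` method, and the returned dictionary follows the `waveform` plus `sample_rate` layout that ComfyUI audio nodes use.

```python
import os

# Voice ids copied from the "Available Voices" list.
VOICES = [
    "af", "af_sarah", "af_bella", "af_nicole", "af_sky",
    "am_adam", "am_michael",
    "bf_emma", "bf_isabella",
    "bm_george", "bm_lewis",
]

# Hypothetical location of the model files; adjust to your install.
MODEL_DIR = os.path.join("ComfyUI", "models", "Kokorotts")


class KokoroTTSSketch:
    """Illustrative node skeleton, not the repository's actual implementation."""

    @classmethod
    def INPUT_TYPES(cls):
        return {
            "required": {
                "text": ("STRING", {"multiline": True, "default": ""}),
                "speaker": (VOICES, {"default": "af_sarah"}),
            }
        }

    RETURN_TYPES = ("AUDIO",)
    RETURN_NAMES = ("audio",)
    FUNCTION = "generate"
    CATEGORY = "kokoro"

    def generate(self, text, speaker):
        # Imported lazily so the class can be inspected without these packages installed.
        import torch
        from kokoro_onnx import Kokoro

        kokoro = Kokoro(
            os.path.join(MODEL_DIR, "kokoro-v0_19.onnx"),
            os.path.join(MODEL_DIR, "voices.json"),
        )
        samples, sample_rate = kokoro.create(text, voice=speaker)
        # ComfyUI audio: waveform shaped [batch, channels, samples] plus its sample rate.
        waveform = torch.from_numpy(samples).float().reshape(1, 1, -1)
        return ({"waveform": waveform, "sample_rate": sample_rate},)
```

Loading the model inside `generate` keeps the sketch short; a real node would more likely load it once and reuse it across calls.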

Error Handling

The node handles the following common failure cases:

  • Missing model or voice files
  • Invalid text input
  • TTS generation failures

Error messages will be logged with detailed information to help troubleshoot any issues.
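
The general pattern for this kind of handling (validate the text, then log and re-raise any synthesis failure) might look like the sketch below. It is an illustration only: `validate_text`, `run_tts`, the `synthesize` callable, and the logger name are all hypothetical, and the node's real checks and messages may differ.

```python
import logging

logger = logging.getLogger("kokoro_tts_sketch")  # hypothetical logger name


def validate_text(text):
    """Reject input that is not a non-empty string and return it stripped."""
    if not isinstance(text, str) or not text.strip():
        raise ValueError("text must be a non-empty string")
    return text.strip()


def run_tts(synthesize, text, speaker):
    """Call synthesize(text, speaker); log the traceback and re-raise on failure."""
    clean_text = validate_text(text)
    try:
        return synthesize(clean_text, speaker)
    except Exception:
        logger.exception("TTS generation failed for speaker %r", speaker)
        raise
```

Here `synthesize` stands in for whatever function actually produces the audio, so the same wrapper can be exercised with a stub during testing.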

License

  • kokoro-onnx: MIT
  • kokoro model: Apache 2.0

Credits
