Releases: KoljaB/RealtimeTTS

v0.4.47

09 Feb 10:48

RealtimeTTS v0.4.47 Release Notes

  • bugfix: paused streams could not be stopped
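
The scenario behind this fix can be sketched roughly as follows. This is a minimal illustration rather than the project's own test: `pause_then_stop` and `RecordingStream` are hypothetical names, and the helper only assumes the stream offers `play_async()`, `pause()` and `stop()`, as `TextToAudioStream` does.

```python
import time


def pause_then_stop(stream, play_seconds=0.0):
    """Start playback in the background, pause it, then stop while paused.

    `stream` is expected to behave like a RealtimeTTS TextToAudioStream,
    i.e. to offer play_async(), pause() and stop(). This helper is a
    hypothetical illustration, not part of the library.
    """
    stream.play_async()       # non-blocking playback
    time.sleep(play_seconds)  # optionally let some audio play first
    stream.pause()            # playback is now paused
    stream.stop()             # before v0.4.47 a paused stream could not be stopped


class RecordingStream:
    """Stand-in that records calls so the sketch runs without audio hardware."""

    def __init__(self):
        self.calls = []

    def play_async(self):
        self.calls.append("play_async")

    def pause(self):
        self.calls.append("pause")

    def stop(self):
        self.calls.append("stop")


fake = RecordingStream()
pause_then_stop(fake)
print(fake.calls)
```

With a real engine, the same helper could be called with a `TextToAudioStream` that has already been fed some text.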

v0.4.46

08 Feb 19:49

RealtimeTTS v0.4.46 Release Notes

  • support for more kokoro voices (japanese, chinese) by installing with pip install "RealtimeTTS[kokoro,jp,zh]"

v0.4.43

04 Feb 15:58

RealtimeTTS v0.4.43 Release Notes

  • raises the kokoro library dependency to version 0.7.3 (from 2025-04-02), which hopefully fixes #259

v0.4.42

03 Feb 19:09

RealtimeTTS v0.4.42 Release Notes

  • KokoroEngine can now be installed with pip install RealtimeTTS[kokoro] (does not need external installation anymore)
  • supports Kokoro-V1.0
  • support for more voices
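
Before constructing a KokoroEngine, it can help to confirm that the optional dependency is actually present. The check below is only a sketch: `is_installed` is a hypothetical helper, and it assumes the kokoro extra provides a package importable under the name `kokoro`.

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module can be found in this environment."""
    return importlib.util.find_spec(module_name) is not None


# Assumption: the [kokoro] extra installs a package importable as "kokoro".
if is_installed("kokoro"):
    print("kokoro found: KokoroEngine should be usable")
else:
    print('kokoro missing: try pip install "RealtimeTTS[kokoro]"')
```

Quoting the package spec in the pip command keeps shells such as zsh from interpreting the square brackets.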

v0.4.41

11 Jan 17:14

RealtimeTTS v0.4.41 Release Notes

New Feature: KokoroEngine Support

  • KokoroEngine Integration

    • Introduces support for the Kokoro 82M TTS engine.
    • Provides access to a variety of Kokoro voice models.
  • Installation:

    pip install realtimetts[all]==0.4.41
  • Setup Resources:

Usage Overview

from RealtimeTTS import TextToAudioStream, KokoroEngine

# Initialize Kokoro engine
engine = KokoroEngine(kokoro_root="path/to/Kokoro-82M")

# Switch voice as needed
engine.set_voice("af_sky")

# Create audio stream
stream = TextToAudioStream(engine)

# Feed and play audio using Kokoro voices
stream.feed("Hello world")
stream.play()

v0.4.40

06 Jan 12:07

RealtimeTTS v0.4.40 Release Notes

Configurable Playback Parameters

New Parameters: frames_per_buffer and playout_chunk_size

  • Purpose:

    • These new parameters provide finer control over audio playback buffering, which is especially useful for mitigating stuttering issues on Unix-based systems.
  • Details:

    1. frames_per_buffer:

      • Controls the number of audio frames processed per buffer by PyAudio.
      • Lower values reduce latency but increase CPU usage, while higher values reduce CPU load but increase latency.
      • Recommended Settings for Stuttering:
        • Start by setting frames_per_buffer to 256.
        • If issues persist, reduce it further to 128.

      Example:

      stream = TextToAudioStream(engine, frames_per_buffer=256)
    2. playout_chunk_size:

      • Specifies the size (in bytes) of audio chunks played out to the stream.
      • Works in conjunction with frames_per_buffer to optimize audio smoothness.
      • Defaults to dynamic calculation, but can be explicitly set for precise control.

      Example:

      stream = TextToAudioStream(engine, playout_chunk_size=1024)

How These Parameters Address Stuttering:

  • On Unix systems, default buffer sizes may cause sporadic stuttering during audio playback due to timing mismatches between the audio stream and system audio drivers.
  • By reducing frames_per_buffer to 256 or 128, the playback becomes more responsive and better aligned with system timing.
  • Adjusting playout_chunk_size further enhances playback smoothness by ensuring optimal chunk delivery to the audio stream.
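
To get a feel for the numbers involved, the arithmetic can be sketched as below. The sample rate of 24000 Hz and the 16-bit mono format are assumptions chosen for illustration (the actual values depend on the engine), and both helper functions are hypothetical, not part of RealtimeTTS.

```python
def buffer_duration_ms(frames_per_buffer, sample_rate):
    """Duration of one buffer of audio frames, in milliseconds."""
    return 1000.0 * frames_per_buffer / sample_rate


def chunk_frames(playout_chunk_size, channels=1, bytes_per_sample=2):
    """Number of audio frames contained in a chunk of playout_chunk_size bytes."""
    return playout_chunk_size // (channels * bytes_per_sample)


SAMPLE_RATE = 24000  # assumed for illustration; depends on the engine

for frames in (1024, 512, 256, 128):
    duration = buffer_duration_ms(frames, SAMPLE_RATE)
    print(f"{frames:>5} frames -> {duration:.2f} ms per buffer")

# Assuming 16-bit mono audio, i.e. 2 bytes per frame
print(chunk_frames(1024), "frames in a 1024-byte chunk")
```

Halving frames_per_buffer halves the time each buffer covers, which is why smaller values respond faster at the cost of more frequent callbacks.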

Usage Examples

Basic Configuration:

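from RealtimeTTS import TextToAudioStream, PiperEngine, PiperVoice

# Placeholder paths: point these at your own Piper voice files
my_voice = PiperVoice(
    model_file="path/to/voice.onnx",
    config_file="path/to/voice.onnx.json",
)
engine = PiperEngine(piper_path="path/to/piper.exe", voice=my_voice)
stream = TextToAudioStream(
    engine=engine,
    frames_per_buffer=256,    # Start with 256 to reduce stuttering
    playout_chunk_size=1024,  # Optional for further customization
)
stream.feed("Hello world")  # Queue some text before playing
stream.play()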

Fine-Tuning for Stuttering:

  • If playback issues occur:
    1. Set frames_per_buffer to 256 (recommended starting point).
    2. Reduce to 128 if stuttering persists.
    3. Optionally adjust playout_chunk_size to a fixed value like 1024 or 512.

  • Backward Compatibility:
    • Defaults for frames_per_buffer and playout_chunk_size maintain compatibility with previous versions, requiring no changes for existing setups unless adjustments are needed.
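
The fine-tuning steps above can be written down as an ordered list of settings to try. This is a sketch only: `tuning_candidates` is a hypothetical helper, and combining the fixed chunk sizes with frames_per_buffer=128 is an assumption about how the optional third step would be applied.

```python
def tuning_candidates():
    """Yield keyword arguments in the order the fine-tuning steps suggest."""
    yield {"frames_per_buffer": 256}  # step 1: recommended starting point
    yield {"frames_per_buffer": 128}  # step 2: if stuttering persists
    for chunk_size in (1024, 512):    # step 3: optional fixed chunk sizes
        yield {"frames_per_buffer": 128, "playout_chunk_size": chunk_size}


for kwargs in tuning_candidates():
    print(kwargs)
    # With an engine available, each candidate could be tried like this:
    # stream = TextToAudioStream(engine, **kwargs)
```

Trying the candidates in this order changes one setting at a time, which makes it easier to tell which change removed the stuttering.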

v0.4.3

02 Jan 09:14

RealtimeTTS v0.4.3 Release Notes

New Feature: PiperEngine

  • Introduction

    • Introduces PiperEngine, which adds support for the Piper text-to-speech model.
  • Installation

    • Separate Installation Required: Piper must be installed separately from RealtimeTTS. Follow the Piper installation tutorial for Windows to set up Piper on your system.

    • Install RealtimeTTS:

      pip install RealtimeTTS

      Note: Unlike other engines, there is no need to install Piper support with pip install RealtimeTTS[piper]. The [piper] option is not supported.

  • Usage

    • Configure PiperEngine:

      • Specify the path to the Piper executable and the desired voice model using the PiperVoice and PiperEngine classes.
      • Refer to the Piper test file for an example of how to set up and use PiperEngine in your projects.
    • Example:

      from RealtimeTTS import TextToAudioStream, PiperEngine, PiperVoice
      
      def dummy_generator():
          yield "This is piper tts speaking."
      
      voice = PiperVoice(
          model_file="D:/Downloads/piper_windows_amd64/piper/en_US-kathleen-low.onnx",
          config_file="D:/Downloads/piper_windows_amd64/piper/en_US-kathleen-low.onnx.json",
      )
      
      engine = PiperEngine(
          piper_path="D:/Downloads/piper_windows_amd64/piper/piper.exe",
          voice=voice,
      )
      
      stream = TextToAudioStream(engine)
      stream.feed(dummy_generator())
      stream.play()
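
Because Piper is installed separately, a wrong path is an easy mistake to make. A small check like the one below can be run before constructing the engine. It is a sketch under stated assumptions: `missing_piper_files` is a hypothetical helper, not part of RealtimeTTS, and the paths shown are placeholders.

```python
from pathlib import Path


def missing_piper_files(piper_path, model_file, config_file):
    """Return the given paths that do not exist as regular files."""
    candidates = [piper_path, model_file, config_file]
    return [str(p) for p in candidates if not Path(p).is_file()]


# Placeholder paths; replace them with your own Piper installation
missing = missing_piper_files(
    "path/to/piper.exe",
    "path/to/voice.onnx",
    "path/to/voice.onnx.json",
)
if missing:
    print("Missing Piper files:", ", ".join(missing))
else:
    print("All Piper files found")
```

Checking up front gives a clearer message than a failure that only shows up once playback starts.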

Additional Information

v0.4.21

14 Dec 17:44

RealtimeTTS v0.4.21 Release Notes

🚀 New Features

  • Updated dependencies to their latest versions (stream2sentence, coqui-tts, elevenlabs, openai, edge-tts)

StyleTTS Engine

  • Added seed support. Fixed a StyleTTS2 problem that caused noise to be generated for very short texts, especially with embedding_scale values > 1.

🛠 Bug Fixes

  • Fixed a problem in stream2sentence causing minimum_sentence_length to not be respected

v0.4.20 🌿

10 Dec 22:05

RealtimeTTS v0.4.20 Release Notes

🚀 New Features

Azure Engine

  • Added support for 48 kHz audio output in the Azure TTS engine for improved audio quality (and providing more flexibility in audio formats).

StyleTTS Engine

  • Introduced StyleTTSVoice for dynamic voice switching, allowing transitions between multiple voice models.

🛠 Bug Fixes

  • Fixed incorrect voice initialization when switching between models in the StyleTTS engine.
  • Fixed model configuration path issues during runtime when updating voice parameters.

v0.4.19

07 Dec 07:47
905f1fb
  • Added support for the StyleTTS2 engine.
  • Updated Coqui-TTS to version 0.25.0, which includes a fix for issue #227
  • Upgraded all dependent libraries to their latest versions