This script is a voice recording and transcription tool. It uses several libraries to record audio, transcribe it, and paste the transcription into the currently active window. It is controlled through a keyboard shortcut: Ctrl+R starts and stops recording.

smian1/Whisper-Voice-Transcription

Whisper Voice Transcription Tool

This Python script records audio from the microphone and transcribes it using OpenAI's API. The transcription is automatically copied to the clipboard and pasted into the currently active window, which makes it useful for quickly capturing spoken words as text.

Features

  • Audio recording with a simple keyboard shortcut.
  • Transcription of recorded audio using OpenAI's API.
  • Automatic pasting of the transcribed text into any text input field.
  • Easy to use with minimal setup required.

Prerequisites

Before running the script, ensure you have the following:

  • Python 3.x installed on your system.
  • An OpenAI API key (sign up at OpenAI to obtain one).

Installation

  1. Clone the repository to your local machine:
    git clone https://github.com/smian1/Whisper-Voice-Transcription.git
    
  2. Navigate to the cloned directory:
    cd Whisper-Voice-Transcription
  3. Install the required dependencies:
     pip install -r requirements.txt
    

Usage

  1. Run the script:
    python voice_transcription.py
    
  2. Start recording by pressing Ctrl+R. Speak into your microphone.
  3. Stop recording by pressing Ctrl+R again. The script will transcribe the audio and paste the transcription into the current active window.
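
The start/stop behaviour of a single shortcut can be sketched as a small toggle. This is an illustrative sketch only, not the code in voice_transcription.py: the class name `RecordingToggle`, the method `handle_hotkey`, and the two callbacks are hypothetical names introduced here, and the wiring to an actual Ctrl+R hotkey listener is deliberately left out.

```python
from typing import Callable


class RecordingToggle:
    """Hypothetical helper: alternates between start and stop on each hotkey press."""

    def __init__(self, on_start: Callable[[], None], on_stop: Callable[[], None]) -> None:
        self._on_start = on_start  # assumed callback that begins audio capture
        self._on_stop = on_stop    # assumed callback that stops capture and transcribes
        self.recording = False

    def handle_hotkey(self) -> None:
        """Call this once per Ctrl+R press."""
        if self.recording:
            self.recording = False
            self._on_stop()
        else:
            self.recording = True
            self._on_start()
```

Keeping the state in one place means the hotkey handler only ever calls `handle_hotkey()`, whichever library delivers the key press.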

API Key Configuration

Place your OpenAI API key in a file named openai_api_key.txt in the same directory as the script.
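
As a rough illustration of how a key file like this might be read, here is a minimal sketch. The file name comes from this README; the function name `load_api_key` and its error handling are assumptions made for the example and may differ from what the script actually does.

```python
from pathlib import Path

KEY_FILE_NAME = "openai_api_key.txt"  # file name as given in this README


def load_api_key(directory: Path) -> str:
    """Hypothetical helper: read the API key from KEY_FILE_NAME inside `directory`."""
    key_path = directory / KEY_FILE_NAME
    if not key_path.is_file():
        raise FileNotFoundError(f"Expected an API key file at {key_path}")
    # Strip whitespace so a trailing newline added by an editor does not end up in the key.
    key = key_path.read_text(encoding="utf-8").strip()
    if not key:
        raise ValueError(f"{key_path} is empty")
    return key
```

From a script, this could be called as `load_api_key(Path(__file__).resolve().parent)` so the file is looked up next to the script rather than in the current working directory. Keep the key file out of version control.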

Running in a Virtual Environment (Optional)

Using a virtual environment is recommended as it keeps dependencies required by different projects separate. Here's how you can set up and use a virtual environment for this script:

  1. Install Virtualenv (if not already installed):

    pip install virtualenv
    
  2. Create a Virtual Environment: Navigate to the project directory and run:

    virtualenv venv

This command creates a new directory named venv in your project directory, which contains the virtual environment.

  3. Activate the Virtual Environment:
    • On macOS and Linux:
      source venv/bin/activate

    • On Windows:
      .\venv\Scripts\activate

  4. Install Dependencies: With the virtual environment activated, install the project dependencies:

    pip install -r requirements.txt

  5. Run the Script: Still in the virtual environment, you can now run the script:

    python voice_transcription.py

  6. Deactivate the Virtual Environment: Once you're done, you can deactivate the virtual environment by running:

    deactivate
    

Using a virtual environment ensures that your project's dependencies are isolated and do not interfere with other Python projects.
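
To confirm that the environment is actually active before installing or running anything, a quick check from Python can help. This is a small sketch; the function name `in_virtual_environment` is hypothetical and not part of the project.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Older virtualenv releases set sys.real_prefix; venv and newer virtualenv
    # releases instead make sys.prefix differ from sys.base_prefix.
    return hasattr(sys, "real_prefix") or sys.prefix != sys.base_prefix


print("Virtual environment active:", in_virtual_environment())
```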

Contributing

Contributions to this project are welcome! Please feel free to fork the repository, make improvements, and submit pull requests.

License

This project is licensed under the MIT License - see the LICENSE file for details.

Acknowledgments and Learning Resources

This project leverages the OpenAI API for transcription capabilities. The development of this script was greatly aided by the comprehensive documentation provided by OpenAI. If you're looking to understand more about how the OpenAI API works or want to explore its extensive capabilities, their documentation is an excellent place to start.

OpenAI API Documentation

The OpenAI API documentation is a valuable resource for anyone interested in integrating advanced AI features into their applications. It offers detailed guidance on how to use the API, including authentication, making requests, handling responses, and understanding rate limits.

You can find the OpenAI API documentation here: OpenAI API Documentation

Whether you're a beginner or an experienced developer, the OpenAI documentation provides insights and instructions that can help you effectively utilize their API in your projects.
