
A scalable Python module for robust audio transcription using OpenAI's Whisper models served through Groq Cloud. Supports multiple languages, batch processing, and output formats such as JSON and SRT.


Arslanex/Groq-Whisper-Transcriber

Groq Whisper Transcription Demo

Overview

This project provides a robust audio transcription module using the Groq Cloud API and Whisper models. It includes comprehensive audio file validation, error handling, and flexible transcription options.

Features

  • Multiple Whisper model support
  • Strict audio file validation
  • Detailed error handling
  • Command-line interface for easy transcription

Prerequisites

  • Python 3.8+
  • Groq Cloud API Key

Installation

  1. Clone the repository
  2. Create a virtual environment
python3 -m venv venv
source venv/bin/activate
  3. Install dependencies
pip install -r requirements.txt
  4. Set up your Groq API Key
# Create a .env file in the project root
echo "GROQ_API_KEY=your_api_key_here" > .env
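Once the .env file exists, the key has to reach the process environment before any API call is made. How this repository loads it is not shown here; the following is a minimal standard-library sketch, and the helper names `load_env_file` and `get_api_key` are hypothetical. Many projects use the python-dotenv package for the same purpose.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Read KEY=VALUE lines from a .env file into os.environ.

    Variables already present in the environment are left untouched,
    so a key exported in the shell takes precedence over the file.
    """
    env_path = Path(path)
    if not env_path.is_file():
        return
    for raw_line in env_path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        # Skip blank lines, comments, and anything that is not KEY=VALUE.
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        os.environ.setdefault(key.strip(), value.strip().strip('"').strip("'"))


def get_api_key(path=".env"):
    """Return GROQ_API_KEY, failing early if it is missing or still the placeholder."""
    load_env_file(path)
    api_key = os.environ.get("GROQ_API_KEY", "")
    if not api_key or api_key == "your_api_key_here":
        raise RuntimeError("GROQ_API_KEY is not set; add it to .env or export it")
    return api_key
```

Failing at startup with a clear message is usually easier to debug than an authentication error returned later by the API.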

Usage

Command Line

python transcribe.py /path/to/audio/file.mp3

In Python Script

from groq_transcriber import GroqTranscriber

transcriber = GroqTranscriber(model="whisper-large-v3")
result = transcriber.transcribe("audio.mp3")
print(result)

Supported Audio Formats

  • MP3
  • MP4
  • MPEG
  • M4A
  • WAV
  • WebM
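A format check against the list above can be sketched as a simple extension lookup. This is an illustration only: the name `is_supported_format` is hypothetical, the mapping from format names to file suffixes is an assumption, and the module's own validation may inspect file contents rather than names.

```python
from pathlib import Path

# Suffixes assumed to correspond to the formats listed above.
SUPPORTED_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".m4a", ".wav", ".webm"}


def is_supported_format(path):
    """Return True if the file extension is in the supported set (case-insensitive)."""
    return Path(path).suffix.lower() in SUPPORTED_EXTENSIONS
```

For example, `is_supported_format("talk.MP3")` returns `True`, while `is_supported_format("talk.flac")` returns `False`.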

Limitations

  • Maximum file size: 25 MB
  • Minimum file length: 0.01 seconds
  • Minimum billed length: 10 seconds
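The limits above translate into two small checks: reject files over the size cap before uploading, and round very short clips up to the minimum billed length when estimating cost. The sketch below assumes that 25 MB means 25 × 1024 × 1024 bytes and that the audio duration is already known; `check_file_size` and `billed_seconds` are hypothetical names, not part of the module's documented API.

```python
import os

MAX_FILE_SIZE_BYTES = 25 * 1024 * 1024  # 25 MB, interpreted as MiB (assumption)
MIN_DURATION_SECONDS = 0.01
MIN_BILLED_SECONDS = 10.0


def check_file_size(path):
    """Return the file size in bytes, raising ValueError if it exceeds the limit."""
    size = os.path.getsize(path)
    if size > MAX_FILE_SIZE_BYTES:
        raise ValueError(f"{path} is {size} bytes; the limit is {MAX_FILE_SIZE_BYTES}")
    return size


def billed_seconds(duration_seconds):
    """Return the duration that is billed for a clip of the given length."""
    if duration_seconds < MIN_DURATION_SECONDS:
        raise ValueError("audio is shorter than the 0.01 second minimum")
    # Clips shorter than 10 seconds are billed as 10 seconds.
    return max(duration_seconds, MIN_BILLED_SECONDS)
```

Under these assumptions, a 3-second clip gives `billed_seconds(3.0) == 10.0`, and a 42.5-second clip is billed at its actual length.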

Troubleshooting

  • Ensure your API key is valid
  • Check audio file format and size
  • Verify network connectivity

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss proposed changes.

License

MIT
