Dataset link: https://www.kaggle.com/datasets/chrisfilo/urbansound8k
This repo demonstrates a basic, widely used approach to classifying audio signals.
Step 1: Build a data preprocessing pipeline.
I resample the audio to 16 kHz and take 48,000 samples (3 s) to feed to the model, since 3 seconds is usually long enough to recognise a sound. Then I use torchaudio to extract mel-spectrogram features. You can gain more intuition from this post: https://haythamfayek.com/2016/04/21/speech-processing-for-machine-learning.html
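A minimal sketch of what this pipeline could look like with torchaudio; the STFT parameters (`n_fft`, `hop_length`, `n_mels`) and the pad/crop strategy are my assumptions, not taken from the repo:

```python
import torch
import torchaudio

SAMPLE_RATE = 16_000
NUM_SAMPLES = 48_000  # 3 s at 16 kHz

mel_transform = torchaudio.transforms.MelSpectrogram(
    sample_rate=SAMPLE_RATE,
    n_fft=1024,      # assumed window size
    hop_length=512,  # assumed hop length
    n_mels=64,       # assumed number of mel bins
)
to_db = torchaudio.transforms.AmplitudeToDB()

def preprocess(path: str) -> torch.Tensor:
    waveform, sr = torchaudio.load(path)
    waveform = waveform.mean(dim=0, keepdim=True)  # mix down to mono
    if sr != SAMPLE_RATE:
        waveform = torchaudio.transforms.Resample(sr, SAMPLE_RATE)(waveform)
    if waveform.shape[1] < NUM_SAMPLES:            # zero-pad short clips
        waveform = torch.nn.functional.pad(
            waveform, (0, NUM_SAMPLES - waveform.shape[1]))
    waveform = waveform[:, :NUM_SAMPLES]           # crop long clips to 3 s
    return to_db(mel_transform(waveform))          # (1, n_mels, frames)
```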
Step 2: Visualize some random samples from the dataset (as always).
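A quick way to do this with matplotlib, reusing the `preprocess` helper sketched above; the `paths` and `labels` lists are assumed to come from the UrbanSound8K metadata CSV:

```python
import random
import matplotlib.pyplot as plt

def show_random_samples(paths, labels, n=4):
    fig, axes = plt.subplots(1, n, figsize=(4 * n, 3))
    for ax, i in zip(axes, random.sample(range(len(paths)), n)):
        mel = preprocess(paths[i])  # (1, n_mels, frames)
        ax.imshow(mel[0], origin="lower", aspect="auto")
        ax.set_title(labels[i])
        ax.set_xlabel("frame")
        ax.set_ylabel("mel bin")
    plt.tight_layout()
    plt.show()
```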
Step 3: Build a mini VGG-19-like CNN (stacked 3x3 conv blocks where the feature maps gradually get smaller after each pooling stage).
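A rough sketch of such a model in PyTorch; the channel widths and block depths here are illustrative choices, not the repo's actual configuration:

```python
import torch.nn as nn

def conv_block(in_ch, out_ch, n_convs):
    """A VGG-style block: several 3x3 convs followed by 2x2 max pooling."""
    layers = []
    for _ in range(n_convs):
        layers += [nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
                   nn.BatchNorm2d(out_ch),
                   nn.ReLU(inplace=True)]
        in_ch = out_ch
    layers.append(nn.MaxPool2d(2))  # halve the feature-map size
    return nn.Sequential(*layers)

class MiniVGG(nn.Module):
    def __init__(self, n_classes=10):  # UrbanSound8K has 10 classes
        super().__init__()
        self.features = nn.Sequential(
            conv_block(1, 32, 2),    # input is a 1-channel mel spectrogram
            conv_block(32, 64, 2),
            conv_block(64, 128, 2),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, n_classes),
        )

    def forward(self, x):
        return self.classifier(self.features(x))
```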
Step 4: Train the model with the Adam optimizer and cross-entropy loss.
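A bare-bones training loop under those choices; `train_loader`, the learning rate, and the epoch count are assumptions for illustration:

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"
model = MiniVGG().to(device)
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)  # assumed lr
criterion = nn.CrossEntropyLoss()

for epoch in range(10):  # assumed number of epochs
    model.train()
    running_loss = 0.0
    for mels, targets in train_loader:  # assumed DataLoader of (mel, label)
        mels, targets = mels.to(device), targets.to(device)
        optimizer.zero_grad()
        loss = criterion(model(mels), targets)
        loss.backward()
        optimizer.step()
        running_loss += loss.item() * mels.size(0)
    print(f"epoch {epoch}: loss={running_loss / len(train_loader.dataset):.4f}")
```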
P/s: Due to limited computational resources, I'll stop here.