# DDSP_SoundSynthesis

The dataset consists solely of solo flute and violin samples from MusicNet. The license for the full dataset is held by MusicNet; see the MusicNet record page on Zenodo for details. Extract the archive with:

```bash
tar -xvf musicnet_fluteviolin.tar.gz
```

Prepare your audio data by converting its sampling rate to 16 kHz mono:

```bash
ffmpeg -y -loglevel fatal -i $input_file -ac 1 -ar 16000 $output_file
```
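
To convert a whole directory at once, a minimal loop could look like the following sketch (the `audio_raw/` and `audio_16k/` directory names are placeholders, not part of this repository):

```bash
# Sketch: resample every .wav under audio_raw/ to 16 kHz mono.
# Directory names are illustrative; adapt them to your layout.
mkdir -p audio_16k
for input_file in audio_raw/*.wav; do
  output_file="audio_16k/$(basename "$input_file")"
  ffmpeg -y -loglevel fatal -i "$input_file" -ac 1 -ar 16000 "$output_file"
done
```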

Use CREPE to precompute (time, frequency, confidence) values from your audio:

```bash
crepe directory-to-audio/ --output directory-to-audio/f0_0.004/ --viterbi --step-size 4
# If this fails, run crepe with just your audio file path instead, as shown below.
```
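
For example, the single-file fallback might look like this (`your_audio.wav` is a placeholder; by default CREPE writes the (time, frequency, confidence) CSV next to the input file):

```bash
# Sketch of the single-file fallback; your_audio.wav is a placeholder name.
crepe directory-to-audio/your_audio.wav --viterbi --step-size 4
```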

Edit the `config.yaml` file to fit your setup (audio location, preprocessing folder, sampling rate, model parameters, etc.), then preprocess your data, train the model, and run inference:

```bash
python preprocess.py
python train.py
python inference.py --input_wav {your input wave file path}
```
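
As a concrete invocation (the path is hypothetical, reusing one of the 16 kHz files from the conversion step above):

```bash
# Example only: audio_16k/flute_sample.wav is a placeholder path.
python inference.py --input_wav audio_16k/flute_sample.wav
```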

Finally, you will have `recon_audio.wav` in your directory. If the timbre transfer succeeded, move on to sound synthesis of the audio below. Keep going with your generation!

## Sound Synthesis with Autoencoder

Check that your audio files have been successfully processed with CREPE. Once the pitch files are in place, you are ready to begin:

```bash
python autoencoder/train.py --data_dir {your dataset path}
python autoencoder/inference.py --pt_dir {your model pt file path} --test_dir {your test dataset directory}
```
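
A hypothetical invocation, with placeholder paths you would replace with your own:

```bash
# All paths below are placeholders, not files shipped with the repository.
python autoencoder/train.py --data_dir data/musicnet_fluteviolin
python autoencoder/inference.py --pt_dir checkpoints/model.pt --test_dir data/test
```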

The result will be saved as a wave file named `mixed_audio_{current time}`. Enjoy the transformation!