Add VC Noro model #247

kenxxxxx · 2024-07-18T12:00:50Z

✨ Description

In this PR, we release an unofficial PyTorch implementation of Noro, a Noise-Robust One-shot Voice Conversion (VC) system. The model converts the timbre of speech from a source speaker to that of a target speaker using only a single reference speech sample, while preserving the semantic content of the original speech. Noro introduces components tailored for VC with noisy reference speech, including a dual-branch reference encoding module and a noise-agnostic contrastive speaker loss.

The main purpose of this PR is to provide a noise-robust VC solution that remains effective even with noisy reference speech, making it suitable for real-world applications. Additionally, we explore the hidden speaker representation capabilities of the VC system by repurposing its reference encoder as a speaker encoder, which performs competitively with advanced self-supervised learning models.

To test this PR, follow the instructions in the updated README.md to set up the environment, train the model, and evaluate its performance under different acoustic environments.

🚧 Related Issues

None

👨‍💻 Changes Proposed

Implemented the Noro model with a dual-branch reference encoding module.
Added the training and evaluation scripts for the Noro model.
Added detailed documentation and examples for training and testing the model.

🧑‍🤝‍🧑 Who Can Review?

@RMSnow @HarryHe11 @Adorable-Qin

✅ Checklist

Code has been reviewed
Code complies with the project's code standards and best practices
Code has passed all tests
Code does not affect the normal use of existing features
Code has been commented properly
Documentation has been updated (if applicable)
Demo/checkpoint has been attached (if applicable)

RMSnow

Thanks for your efforts! Great job! This is the first time we are introducing VC, so let us set a high bar for future developers!

README.md

config/vc.json

egs/vc/README.md

egs/vc/exp_config_4gpu_clean.json

models/base/vc_dataset.py

models/vc/ns2_uniamphion.py

models/vc/vc_loss.py

models/vc/vc_trainer.py

models/vc/vc_utils.py

HarryHe11 · 2024-07-29T05:37:41Z

@RMSnow Thank you, Xueyao, for your detailed comments! @kenxxxxx Yuchen, please familiarize yourself with Git-based development and directly update your code on your fork so we can track your revision progress.

RMSnow · 2024-10-08T16:29:39Z

egs/vc/README.md

@@ -0,0 +1,20 @@
+# Amphion Singing Voice Cloning (VC) Recipe


This should read "Voice Conversion Recipe".

RMSnow · 2024-10-08T16:29:53Z

egs/vc/README.md

+
+## Quick Start
+
+We provide a **[beginner recipe](Noro)** to demonstrate how to train a cutting edge SVC model. Specifically, it is an official implementation of the paper "NORO: A Noise-Robust One-Shot Voice Conversion System with Hidden Speaker Representation Capabilities".


Typo: "SVC model" should be "VC model".

RMSnow · 2024-10-08T16:30:55Z

models/tts/naturalspeech2/ns2_trainer.py

Will this change affect the ns2 TTS model?

RMSnow · 2024-10-08T16:32:43Z

BTW, please run Black to format the code so that it passes the format check.

kenxxxxx and others added 6 commits July 18, 2024 14:30

Initial commit

23d2df7

successful run code

7333555

Create README.md

138b6a1

Update README.md

4a484fb

Add news

Code cleaned

070df5b

Merge branch 'vc_noro' of https://github.com/kenxxxxx/Amphion into vc_noro

099052a

HarryHe11 requested review from HarryHe11 and RMSnow July 19, 2024 01:28

RMSnow requested changes Jul 19, 2024

kenxxxxx and others added 11 commits August 28, 2024 23:31

code clean

962dd27

add image

5cf2a4a

Update README.md

ab2553c

Create README.md

868ef31

Update README.md

9e0eba2

code update

0c34c69

README.md conflict resolved

546e3b7

README.md conflict

ff5bb74

README.md conflict resolved

5d1bbdd

Update README.md

799bec0

Update README.md

0d05a37

kenxxxxx requested a review from RMSnow September 25, 2024 15:24

RMSnow requested changes Oct 8, 2024
