Speech enhancement network based on STOI score loss function

Reagan1947/STOI-Enhance-Net

STOI-Enhance-Net

Speech Enhancement Network Based on a STOI Score Loss Function

1. Dataset

Microsoft Scalable Noisy Speech Dataset (MS-SNSD)
  • This dataset contains a large collection of clean speech files and a variety of environmental noise files in .wav format, sampled at 16 kHz.

  • The main application of this dataset is training Deep Neural Network (DNN) models to suppress background noise, but it can also be used for other audio and speech applications.

  • MS-SNSD provides a recipe for mixing clean speech and noise at various signal-to-noise ratio (SNR) conditions to generate a large noisy speech dataset.

  • The SNR conditions and the number of hours of data required can be configured depending on the application requirements.

  • More info: https://github.com/microsoft/MS-SNSD
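
The idea of mixing clean speech and noise at a chosen SNR can be illustrated with a minimal NumPy sketch. This is not the MS-SNSD synthesizer script; the function name `mix_at_snr` and the synthetic test signals are assumptions made for illustration, and a real pipeline would load .wav files and may apply additional level normalization.

```python
import numpy as np

def mix_at_snr(clean, noise, snr_db):
    """Hypothetical helper: add `noise` to `clean` at the requested SNR (dB)."""
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(noise, dtype=np.float64)

    # Repeat or trim the noise so it matches the clean signal length.
    if len(noise) < len(clean):
        reps = int(np.ceil(len(clean) / len(noise)))
        noise = np.tile(noise, reps)
    noise = noise[:len(clean)]

    clean_power = np.mean(clean ** 2)
    noise_power = np.mean(noise ** 2)

    # Choose gain so that clean_power / (gain^2 * noise_power) = 10^(snr_db / 10).
    gain = np.sqrt(clean_power / (noise_power * 10 ** (snr_db / 10)))
    return clean + gain * noise

# Synthetic stand-ins for one second of 16 kHz audio (assumed, not real data).
rng = np.random.default_rng(0)
clean = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)
noise = rng.standard_normal(8000)

noisy = mix_at_snr(clean, noise, snr_db=10)

# Measure the SNR that was actually achieved.
residual = noisy - clean
achieved_snr_db = 10 * np.log10(np.mean(clean ** 2) / np.mean(residual ** 2))
```

Calling the helper over a list of SNR values (for example 0, 10, 20 dB) would produce several noisy versions of the same clean utterance.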

2. Based on

STOI

Existing objective speech-intelligibility measures are suitable for several types of degradation, however, it turns out that they are less appropriate for methods where noisy speech is processed by a time-frequency (TF) weighting, e.g., noise reduction and speech separation. In this paper, we present an objective intelligibility measure, which shows high correlation (rho=0.95) with the intelligibility of both noisy, and TF-weighted noisy speech. The proposed method shows significantly better performance than three other, more sophisticated, objective measures. Furthermore, it is based on an intermediate intelligibility measure for short-time (approximately 400 ms) TF-regions, and uses a simple DFT-based TF-decomposition. In addition, a free Matlab implementation is provided.

More info: https://ieeexplore.ieee.org/document/5495701
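
To make the short-time, DFT-based idea concrete, here is a heavily simplified sketch of a STOI-like score. It is not the published STOI algorithm and not this repository's loss: it omits resampling, silent-frame removal, one-third octave band grouping, and the normalization and clipping steps, and the names `stft_mag` and `short_time_correlation` are hypothetical. It only shows the core idea of correlating clean and processed temporal envelopes over short segments of DFT frames.

```python
import numpy as np

def stft_mag(x, frame_len=256, hop=128):
    """Magnitude of a Hann-windowed DFT per frame, shape (bins, frames)."""
    x = np.asarray(x, dtype=np.float64)
    n_frames = 1 + (len(x) - frame_len) // hop
    window = np.hanning(frame_len)
    frames = np.stack(
        [x[i * hop:i * hop + frame_len] * window for i in range(n_frames)]
    )
    return np.abs(np.fft.rfft(frames, axis=1)).T

def short_time_correlation(clean, processed, seg_len=30, eps=1e-12):
    """Average per-bin correlation over sliding segments of `seg_len` frames.

    With frame_len=256 and hop=128, 30 frames span about 384 ms at a 10 kHz
    sampling rate; at 16 kHz the same settings cover a shorter window.
    """
    X = stft_mag(clean)
    Y = stft_mag(processed)
    scores = []
    for m in range(seg_len, X.shape[1] + 1):
        xs = X[:, m - seg_len:m]
        ys = Y[:, m - seg_len:m]
        # Remove the per-bin mean, then compute a normalized inner product.
        xs = xs - xs.mean(axis=1, keepdims=True)
        ys = ys - ys.mean(axis=1, keepdims=True)
        num = np.sum(xs * ys, axis=1)
        den = np.linalg.norm(xs, axis=1) * np.linalg.norm(ys, axis=1) + eps
        scores.append(np.mean(num / den))
    return float(np.mean(scores))

# Synthetic signals stand in for real speech (assumed, for illustration only).
rng = np.random.default_rng(0)
clean = rng.standard_normal(16000)
degraded = clean + 2.0 * rng.standard_normal(16000)

score_same = short_time_correlation(clean, clean)
score_degraded = short_time_correlation(clean, degraded)
```

A training loss built on such a score would typically minimize its negative, but that requires expressing the same operations in a differentiable framework, which this NumPy sketch does not do.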

IDSEGAN

The idsegan repository contains DSEGAN, ISEGAN, and the baseline SEGAN, as described in the following paper:

H. Phan, I. V. McLoughlin, L. Pham, O. Y. Chén, P. Koch, M. De Vos, and A. Mertins, "Improving GANs for Speech Enhancement," IEEE Signal Processing Letters, 2020. (accepted)

More info: https://github.com/pquochuy/idsegan

3. Status

  1. Dataset Preparation

4. Dependencies

  • tensorflow_gpu == 1.9
  • numpy == 1.1.3
  • scipy == 1.0.0
