  • Stars: 260
  • Rank: 156,242 (top 4%)
  • Language: Python
  • Created about 5 years ago
  • Updated about 4 years ago

Repository Details

A minimal, unofficial PyTorch implementation of "A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement" (CRN).

A Convolutional Recurrent Neural Network for Real-Time Speech Enhancement

ToDo

  • Real-time version
  • Update trainer
  • Visualize spectrograms and metrics (PESQ, STOI, SI-SDR) during training
  • More docs

Usage

Training:

python train.py -C config/train/baseline_model.json5

Inference:

python inference.py \
    -C config/inference/basic.json5 \
    -cp ~/Experiments/CRN/baseline_model/checkpoints/latest_model.tar \
    -dist ./enhanced

See the README of Wave-U-Net-for-Speech-Enhancement for more details on usage.
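The inference command above passes three short flags. As a rough illustration only, the sketch below shows one way a script could accept flags of that shape with `argparse`. Only the flag spellings (`-C`, `-cp`, `-dist`) come from the command. The destination names, help strings, and the `build_parser` helper are assumptions and may not match what `inference.py` actually does.

```python
import argparse
from pathlib import Path


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the flag spellings are taken from the README;
    # dest names and help texts are illustrative assumptions.
    parser = argparse.ArgumentParser(description="Inference flags (sketch)")
    parser.add_argument("-C", dest="config", type=Path, required=True,
                        help="path to the JSON5 configuration file")
    parser.add_argument("-cp", dest="checkpoint_path", type=Path, required=True,
                        help="path to the saved model checkpoint")
    parser.add_argument("-dist", dest="output_dir", type=Path, required=True,
                        help="directory for the enhanced audio")
    return parser


# Parse an explicit argument list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args([
    "-C", "config/inference/basic.json5",
    "-cp", "latest_model.tar",
    "-dist", "./enhanced",
])
print(args.config, args.checkpoint_path, args.output_dir)
```

Note that `~` in a checkpoint path is expanded by the shell, not by `argparse`, so a script receiving a quoted `~/...` path would need `Path.expanduser()`.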

Performance

PESQ, STOI, and SI-SDR on the Voice Bank + DEMAND test set (for reference only):

Experiment                  PESQ    SI-SDR (dB)   STOI
Noisy                       1.979   8.511         0.9258
CRN                         2.528   17.71         0.9325
CRN signal approximation    2.606   17.84         0.9382
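For readers unfamiliar with SI-SDR, the following is a minimal pure-Python sketch of the commonly used definition (scale-invariant signal-to-distortion ratio, in dB). It is not taken from this repository. The function name `si_sdr`, the mean removal step, and the `eps` stabilizer are assumptions, so the values it produces may differ slightly from those in the table.

```python
import math
from typing import Sequence


def si_sdr(estimate: Sequence[float], reference: Sequence[float],
           eps: float = 1e-8) -> float:
    """Scale-invariant SDR in dB (higher is better). Illustrative sketch."""
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")

    # Assumption: remove the mean so a DC offset does not affect the score.
    est_mean = sum(estimate) / len(estimate)
    ref_mean = sum(reference) / len(reference)
    est = [e - est_mean for e in estimate]
    ref = [r - ref_mean for r in reference]

    # Project the estimate onto the reference to get the scaled target.
    dot = sum(e * r for e, r in zip(est, ref))
    ref_energy = sum(r * r for r in ref)
    alpha = dot / (ref_energy + eps)
    target = [alpha * r for r in ref]

    # Everything in the estimate that is not the scaled target counts as distortion.
    distortion = [e - t for e, t in zip(est, target)]
    target_energy = sum(t * t for t in target)
    distortion_energy = sum(d * d for d in distortion)
    return 10.0 * math.log10((target_energy + eps) / (distortion_energy + eps))


# Toy signals: a sine wave as the clean reference, plus a small deterministic error.
clean = [math.sin(0.1 * i) for i in range(200)]
enhanced = [c + 0.1 * math.cos(0.37 * i) for i, c in enumerate(clean)]
print(f"SI-SDR: {si_sdr(enhanced, clean):.2f} dB")
```

Because the target is rescaled before the ratio is taken, multiplying the estimate by a constant gain leaves the score essentially unchanged, which is what distinguishes SI-SDR from plain SDR.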

Dependencies

  • Python==3.*.*
  • torch==1.*
  • librosa==0.7.0
  • tensorboard
  • pesq
  • pystoi
  • matplotlib
  • tqdm
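Before training, it can help to confirm that the packages listed above are importable. The snippet below is a small sketch using only the standard library. The `find_missing` helper is hypothetical and not part of this repository, and it checks only that each module can be found, not that its version matches the pins above.

```python
import importlib.util
from typing import Iterable, List

# Top-level import names for the dependencies listed above.
REQUIRED_MODULES = [
    "torch", "librosa", "tensorboard", "pesq", "pystoi", "matplotlib", "tqdm",
]


def find_missing(modules: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be located."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed packages are importable.")
```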

More Repositories

  1. Wave-U-Net-for-Speech-Enhancement (Python, 284 stars)
     Implement Wave-U-Net by PyTorch, and migrate it to the speech enhancement.

  2. IRM-based-Speech-Enhancement-using-LSTM (Python, 103 stars)
     Ideal Ratio Mask (IRM) Estimation based Speech Enhancement using LSTM

  3. spiking-fullsubnet (Python, 56 stars)
     Official repository of Spiking-FullSubNet, the Intel N-DNS Challenge Algorithmic Track Winner.

  4. SNR-Based-Progressive-Learning-of-Deep-Neural-Network-for-Speech-Enhancement (Python, 41 stars)
     Implementation of the paper "SNR-Based Progressive Learning of Deep Neural Network for Speech Enhancement."

  5. SpEx (Python, 29 stars)
     Implementation of "SpEx: Multi-Scale Time Domain Speaker Extraction Network".

  6. Build-SE-Dataset (Python, 24 stars)
     Build speech enhancement dataset.

  7. llm-tse (JavaScript, 21 stars)
     Typing to Listen at the Cocktail Party: Text-Guided Target Speaker Extraction (LLM-TSE)

  8. UNetGAN-Demo (JavaScript, 19 stars)
     [INTERSPEECH 2019] Waiting Update! This project is a demonstration of the paper UNetGAN: A Robust Speech Enhancement Approach in Time Domain for Extremely Low Signal-to-noise Ratio Condition.

  9. audioinfo (Python, 11 stars)
     A small tool to calculate the distribution of audio durations in a directory

  10. MetricGAN-PyTorch (Python, 8 stars)
      MetricGAN with PyTorch, currently in progress.

  11. Masking-and-Inpainting (3 stars)
      This link gives you access to the Demo page:

  12. gatsbyblog (JavaScript, 1 star)
      base on gatsby.js

  13. Extremely-Low-SNR-Demo (JavaScript, 1 star)
      [❌ Deprecated] A demonstration of the paper A robust speech enhancement approach based on deep adversarial learning for extremely low signal-to-noise condition.