DPCRN_DNS3
Created on Mon Oct 28 16:05:31 2021
@author: xiaohuai.le
This repository is the official implementation of the paper "DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement". This work took third place in the Deep Noise Suppression (DNS) Challenge.
Requirements
tensorflow>=1.14,
numpy,
matplotlib,
librosa,
soundfile.
Datasets
We use the Deep Noise Suppression Dataset and the OpenSLR26 and OpenSLR28 RIR datasets for training and validation. The directory structure of the dataset is shown below:
dataset
├── clean
│   ├── audio1.wav
│   ├── audio2.wav
│   ├── audio3.wav
│   └── ...
└── noise
    ├── audio1.wav
    ├── audio2.wav
    ├── audio3.wav
    └── ...
RIR
└── rirs
    ├── rir1.wav
    ├── rir2.wav
    ├── rir3.wav
    └── ...
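As a minimal sketch of how this layout can be consumed, the hypothetical helper below (not part of the repository) enumerates clean/noise training pairs by cycling through the noise clips:

```python
import os

def list_wavs(folder):
    """Return the sorted paths of all .wav files in a folder."""
    return sorted(
        os.path.join(folder, f)
        for f in os.listdir(folder)
        if f.endswith(".wav")
    )

def make_pairs(dataset_root):
    """Pair each clean utterance with a noise clip by index (cyclic)."""
    clean = list_wavs(os.path.join(dataset_root, "clean"))
    noise = list_wavs(os.path.join(dataset_root, "noise"))
    return [(c, noise[i % len(noise)]) for i, c in enumerate(clean)]
```

In practice the training script would mix each pair at a random SNR and optionally convolve the clean speech with an RIR from the `rirs` folder.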
Training and test
Run the following command to train:
python main.py --mode train --cuda 0 --experimentName experiment_1
Run the following command to test the model on a directory of noisy files:
python main.py --mode test --test_dir the_dir_of_noisy --output_dir the_dir_of_enhancement_results
More samples
The final results on the blind test set of DNS3 are available at https://github.com/Le-Xiaohuai-speech/DPCRN_DNS3_Results.
Real-time inference
Note that real-time inference only runs with TensorFlow 1.x.
Run the real-time inference script to measure the processing time of a single frame:
python ./real_time_processing/real_time_DPCRN.py
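The idea behind this measurement can be sketched without TensorFlow: time a per-frame processing function over many frames and report the average cost. Here `enhance_frame` is a dummy placeholder for the model's single-frame inference, and the 512-sample frame at 16 kHz is an assumption for illustration:

```python
import time
import numpy as np

FRAME_LEN = 512   # assumed: 32 ms frame at 16 kHz
N_FRAMES = 1000

def enhance_frame(frame):
    """Placeholder for one step of model inference on a single frame."""
    return frame * 0.5  # dummy processing

frames = np.random.randn(N_FRAMES, FRAME_LEN).astype(np.float32)

start = time.perf_counter()
for f in frames:
    enhance_frame(f)
elapsed = time.perf_counter() - start

avg_ms = 1000.0 * elapsed / N_FRAMES
print(f"average cost per frame: {avg_ms:.3f} ms")
```

Real-time processing is feasible when the average per-frame cost stays below the hop duration (16 ms with 50% overlap under the assumptions above).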
Tensorflow Lite quantization and pruning
A TFLite file of a smaller DPCRN model is provided. To enhance a single WAV file:
python ./inference/real_time_inference/inference.py
Streaming recording and enhancement:
python ./inference/real_time_inference/recording.py
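Streaming enhancement boils down to a frame-by-frame analysis/synthesis loop: window a frame, transform it, apply the model, and overlap-add the result. The sketch below uses an identity `enhance_frame` as a stand-in for the model, with assumed window (512) and hop (256) sizes; it is an illustration of the loop structure, not the repository's implementation:

```python
import numpy as np

FRAME = 512  # assumed window length
HOP = 256    # assumed hop (50% overlap)

def enhance_frame(spec):
    """Placeholder for the model's per-frame spectral enhancement."""
    return spec  # identity pass-through for illustration

def stream_enhance(signal):
    """Frame-by-frame STFT -> enhance -> overlap-add synthesis."""
    window = np.hanning(FRAME)
    out = np.zeros(len(signal) + FRAME)
    for start in range(0, len(signal) - FRAME + 1, HOP):
        frame = signal[start:start + FRAME] * window
        spec = np.fft.rfft(frame)          # analysis
        spec = enhance_frame(spec)         # per-frame model step
        out[start:start + FRAME] += np.fft.irfft(spec, FRAME) * window
    return out[:len(signal)]
```

In a live setting the loop body runs once per captured audio block, so the per-frame cost measured above directly bounds the achievable latency.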
Citations
@inproceedings{le21b_interspeech,
author={Xiaohuai Le and Hongsheng Chen and Kai Chen and Jing Lu},
title={{DPCRN: Dual-Path Convolution Recurrent Network for Single Channel Speech Enhancement}},
year=2021,
booktitle={Proc. Interspeech 2021},
pages={2811--2815},
doi={10.21437/Interspeech.2021-296}
}