  • Language
    Python
  • License
    MIT License

Repository Details

Different implementations of "Weighted Prediction Error" for speech dereverberation

nara_wpe

Weighted Prediction Error for speech dereverberation

Background noise and signal reverberation due to reflections in an enclosure are the two main impairments in acoustic signal processing and far-field speech recognition. This work addresses signal dereverberation techniques based on WPE for speech recognition and other far-field applications. WPE is a compelling algorithm to blindly dereverberate acoustic signals based on long-term linear prediction.

The main algorithm is based on the following paper: Yoshioka, Takuya, and Tomohiro Nakatani. "Generalization of multi-channel linear prediction methods for blind MIMO impulse response shortening." IEEE Transactions on Audio, Speech, and Language Processing 20.10 (2012): 2707-2720.
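To make the idea of delayed linear prediction concrete, here is a minimal NumPy sketch of iterative offline WPE for a single frequency bin. It is an illustration under stated assumptions, not the package's code: the names `stack_delayed_frames`, `wpe_single_bin`, and the `context` smoothing parameter are hypothetical, and the input is assumed to be one STFT bin of shape (channels, frames).

```python
import numpy as np


def stack_delayed_frames(Y, taps, delay):
    """Stack `taps` delayed copies of Y (channels x frames), zero-padded at the start."""
    D, T = Y.shape
    Y_tilde = np.zeros((D * taps, T), dtype=Y.dtype)
    for k in range(taps):
        shift = delay + k
        if shift < T:
            Y_tilde[k * D:(k + 1) * D, shift:] = Y[:, :T - shift]
    return Y_tilde


def wpe_single_bin(Y, taps=10, delay=3, iterations=3, context=0, reg=1e-10):
    """Illustrative iterative WPE for one frequency bin (hypothetical helper).

    Y has shape (channels, frames). Returns the dereverberated estimate
    with the same shape.
    """
    Y_tilde = stack_delayed_frames(Y, taps, delay)
    kernel = np.ones(2 * context + 1) / (2 * context + 1)
    X = Y.copy()
    for _ in range(iterations):
        # Time-varying power of the current estimate, averaged over channels
        # and smoothed over 2 * context + 1 neighbouring frames.
        power = np.mean(np.abs(X) ** 2, axis=0)
        power = np.convolve(power, kernel, mode="same")
        power = np.maximum(power, reg)

        # Power-weighted correlation matrix and cross-correlation matrix.
        weighted = Y_tilde / power
        R = weighted @ Y_tilde.conj().T
        P = weighted @ Y.conj().T

        # Small diagonal loading keeps the linear system well conditioned.
        R = R + (reg * np.trace(R).real / R.shape[0]) * np.eye(R.shape[0])

        # Solve for the prediction filter and subtract the predicted late part.
        G = np.linalg.solve(R, P)
        X = Y - G.conj().T @ Y_tilde
    return X
```

The `delay` keeps the most recent frames out of the prediction, so the direct signal and early reflections are left untouched while the predictable late reverberation is subtracted. A full implementation would run this for every frequency bin of the STFT.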

Content

  • Iterative offline WPE, block-online WPE, and recursive frame-online WPE
  • All algorithms are implemented in both NumPy and TensorFlow (works with version 1.12.0).
  • Continuously tested with Python 3.7, 3.8, 3.9 and 3.10.
  • Automatically built documentation: nara-wpe.readthedocs.io
  • Modular design to facilitate changes for further research
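As a rough illustration of the recursive frame-online variant listed above, the following NumPy sketch updates the prediction filter frame by frame with a recursive least squares step for a single frequency bin. It is a sketch under assumptions, not the package's implementation: the function name `online_wpe_single_bin`, the forgetting factor `alpha`, and the way the power is estimated from the most recent observed frames are all choices made for this example.

```python
import numpy as np


def online_wpe_single_bin(Y, taps=10, delay=3, alpha=0.999, eps=1e-10):
    """Illustrative frame-online WPE for one frequency bin (hypothetical helper).

    Y has shape (channels, frames). Each frame is processed using only
    the current and past frames.
    """
    D, T = Y.shape
    K = D * taps
    R_inv = np.eye(K, dtype=complex)     # inverse of the weighted correlation matrix
    G = np.zeros((K, D), dtype=complex)  # prediction filter
    X = np.zeros((D, T), dtype=complex)

    # Zero-pad the past so that every frame has a complete history window.
    pad = delay + taps - 1
    Y_pad = np.concatenate([np.zeros((D, pad), dtype=complex), Y], axis=1)

    for t in range(T):
        # Delayed frames t - delay, ..., t - delay - taps + 1, flattened to a vector.
        idx = t + pad - delay - np.arange(taps)
        y_tilde = Y_pad[:, idx].T.reshape(-1)

        # Subtract the predicted late reverberation using the current filter.
        x = Y[:, t] - G.conj().T @ y_tilde

        # Power estimate: mean over channels and the most recent observed frames.
        power = max(np.mean(np.abs(Y_pad[:, t:t + pad + 1]) ** 2), eps)

        # Recursive least squares update with exponential forgetting.
        Ry = R_inv @ y_tilde
        gain = Ry / (alpha * power + np.vdot(y_tilde, Ry))
        R_inv = (R_inv - np.outer(gain, y_tilde.conj() @ R_inv)) / alpha
        G = G + np.outer(gain, x.conj())

        X[:, t] = x
    return X
```

A forgetting factor close to 1 gives a long memory and a smoother filter, while smaller values adapt faster to changing acoustics at the cost of a noisier estimate.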

Installation

If you just want to use the package, install it directly with pip:

pip install nara_wpe

If you want to make changes or need the most recent version, clone the repository and install it in editable mode:

git clone https://github.com/fgnt/nara_wpe.git
cd nara_wpe
pip install --editable .

Check the example notebook for further details. If you download the example notebook, you can listen to the input audio examples and to the dereverberated output too.

Citation

To cite this implementation, you can cite the following paper:

@InProceedings{Drude2018NaraWPE,
  Title     = {{NARA-WPE}: A Python package for weighted prediction error dereverberation in {Numpy} and {Tensorflow} for online and offline processing},
  Author    = {Drude, Lukas and Heymann, Jahn and Boeddeker, Christoph and Haeb-Umbach, Reinhold},
  Booktitle = {13. ITG Fachtagung Sprachkommunikation (ITG 2018)},
  Year      = {2018},
  Month     = {Oct},
}

To view the paper, see IEEE Xplore (PDF). For a preview, see Paderborn University RIS (PDF).

Comparison with the NTT WPE implementation

The Johns Hopkins University paper reporting on their CHiME-5 challenge results (Manohar, Vimal: Acoustic Modeling for Overlapping Speech Recognition: JHU CHiME-5 Challenge System, ICASSP 2019) dedicates an entire table to comparing the Nara-WPE implementation with the NTT WPE implementation. Their result is that the Nara-WPE implementation is at least as good as the NTT WPE implementation in all reported conditions.

Development history

nara_wpe has included a TensorFlow implementation since 2017-09-05. It has been tested with a few test cases against the NumPy implementation.

The first version of the NumPy implementation was written in June 2017 while Lukas Drude and Kateřina Žmolíková resided in Nara, Japan. The aim was to have a publicly available implementation of Takuya Yoshioka's 2012 paper.
