fgnt/pb_bss

Stars: 264
Rank: 155,103 (Top 4 %)
Language: Python
License: MIT License
Created: over 7 years ago
Updated: 4 months ago

Collection of EM algorithms for blind source separation of audio signals

Blind Source Separation (BSS) algorithms

This repository covers EM algorithms to separate speech sources in multi-channel recordings.

In particular, the repository contains methods to integrate Deep Clustering (a neural network-based source separation algorithm) with a probabilistic spatial mixture model, as proposed in the paper "Tight integration of spatial and spectral features for BSS with Deep Clustering embeddings", presented at Interspeech 2017 in Stockholm.

@InProceedings{Drude2017DeepClusteringIntegration,
  Title                    = {Tight integration of spatial and spectral features for {BSS} with Deep Clustering embeddings},
  Author                   = {Drude, Lukas and Haeb-Umbach, Reinhold},
  Booktitle                = {INTERSPEECH 2017, Stockholm, Sweden},
  Year                     = {2017},
  Month                    = {Aug}
}
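
To illustrate the kind of alternation an EM algorithm performs, here is a minimal sketch on a toy problem: a two-component 1-D Gaussian mixture fitted with the standard library only. This is not code from pb_bss and does not use its API; the function name `em_two_gaussians`, its parameters, and the synthetic data are assumptions made purely for illustration. In mixture-model-based source separation, the class posteriors computed in the E-step are typically interpreted as time-frequency masks, one per source.

```python
import math
import random


def em_two_gaussians(x, iterations=50):
    """Fit a two-component 1-D Gaussian mixture with EM (illustrative only).

    Returns (weights, means, variances, posteriors), where posteriors[n][k]
    is the probability that observation n belongs to component k.
    """
    # Crude initialization: place the two means at the extremes of the data.
    means = [min(x), max(x)]
    variances = [1.0, 1.0]
    weights = [0.5, 0.5]
    posteriors = []

    for _ in range(iterations):
        # E-step: compute class posteriors for every observation.
        posteriors = []
        for value in x:
            joint = [
                weights[k]
                * math.exp(-0.5 * (value - means[k]) ** 2 / variances[k])
                / math.sqrt(2.0 * math.pi * variances[k])
                for k in range(2)
            ]
            total = max(sum(joint), 1e-300)  # guard against underflow
            posteriors.append([j / total for j in joint])

        # M-step: re-estimate the parameters from the soft assignments.
        for k in range(2):
            n_k = max(sum(p[k] for p in posteriors), 1e-12)
            weights[k] = n_k / len(x)
            means[k] = sum(p[k] * v for p, v in zip(posteriors, x)) / n_k
            spread = sum(p[k] * (v - means[k]) ** 2 for p, v in zip(posteriors, x))
            variances[k] = max(spread / n_k, 1e-6)  # variance floor

    return weights, means, variances, posteriors


# Synthetic data: two well-separated clusters (hypothetical example values).
random.seed(0)
data = [random.gauss(-2.0, 0.5) for _ in range(200)]
data += [random.gauss(3.0, 0.5) for _ in range(200)]

weights, means, variances, posteriors = em_two_gaussians(data)
print("weights:", weights)
print("means:", means)
```

The models in this repository replace the scalar Gaussian with distributions suited to multi-channel audio, but the loop structure (posteriors, then parameter updates) is the same idea.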

Installation

Install it directly from source

git clone https://github.com/fgnt/pb_bss.git
cd pb_bss
pip install --editable .

We expect that numpy, scipy and cython are installed (e.g. conda install numpy scipy cython or pip install numpy scipy cython).
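
If you are unsure whether these prerequisites are present in the active environment, a quick check along the following lines may help. This is only a sketch using the standard library; the helper name `missing_packages` is hypothetical, and the import names (in particular `Cython` with a capital C) are assumptions about how the packages are exposed.

```python
import importlib.util


def missing_packages(names):
    """Return the entries of `names` that cannot be located for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed import names; Cython's importable package is spelled `Cython`.
missing = missing_packages(["numpy", "scipy", "Cython"])
if missing:
    print("Please install first:", ", ".join(missing))
else:
    print("All prerequisites found.")
```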

By default, only the required dependencies are installed. To run the tests or execute the notebooks, use one of the following commands instead:

pip install --editable .[all]  # Without a whitespace between `.` and `[all]`
pip install git+https://github.com/fgnt/pb_bss.git#egg=pb_bss[all]

Other repositories by fgnt

nara_wpe

Different implementations of "Weighted Prediction Error" for speech dereverberation

nn-gev

Neural network supported GEV beamformer

pb_chime5

Speech enhancement system for the CHiME-5 dinner party scenario

sms_wsj

SMS-WSJ: Spatialized Multi-Speaker Wall Street Journal database for multi-channel source separation and recognition

meeteval

MeetEval - A meeting transcription evaluation toolkit

padertorch

A collection of common functionality to simplify the design, training and evaluation of machine learning models based on PyTorch, with an emphasis on speech processing.

pb_sed

Paderborn Sound Event Detection

ci_sdr

mms_msg

Multipurpose Multi Speaker Mixture Signal Generator

paderbox

Paderbox: A collection of utilities for audio / speech processing

graph_pit

sed_scores_eval

lazy_dataset

lazy_dataset: Process large datasets as if they were iterables.

LatticeWordSegmentation

Software to apply unsupervised word segmentation on lattices or text sequences using a nested hierarchical Pitman-Yor language model

paderwasn

Paderwasn is a collection of methods for acoustic signal processing in wireless acoustic sensor networks (WASNs).

nhpylm

Python bindings for a C++-based implementation of the Nested Hierarchical Pitman-Yor Language model

sins

python_crashkurs

Jupyter Notebook

oaf

Jupyter notebooks for the lecture "Optimal and adaptive filters"

Jupyter Notebook

mnist

dlp_mpi

nachrichtentechnik

Jupyter notebooks for the lecture "Nachrichtentechnik" (communications engineering) with explanations in German.

Jupyter Notebook

libriwasn

Tools and scripts for the LibriWASN data set from zenodo

ham_radio

speaker_reassignment

Once more Diarization: Improving meeting transcription systems through segment-level speaker reassignment

upb_audio_tagging_2019

UPB system for the Kaggle competition "Freesound Audio Tagging 2019"

asnsig

ASNSIG – A Signal Generator for Ad-Hoc Acoustic Sensor Networks in Smart Home Environments

2019_ad_xidian