A must-read paper and tutorial list for speech separation based on neural networks
This repository contains papers for pure speech separation and multimodal speech separation.
By Kai Li (if you have any suggestions, please contact me! Email: [email protected]).
Tip: If you are new to speech separation, I recommend reading the "deep clustering" and "PIT & uPIT" papers first; they will help you understand the problem.
If you have found code for any of the papers below, you are welcome to add links.
‼️ New board: new papers are introduced every week! See Weekly_Report.md
Pure Speech Separation
✔️ [Joint Optimization of Masks and Deep Recurrent Neural Networks for Monaural Source Separation, Po-Sen Huang, TASLP 2015] [Paper][Code (posenhuang)]
✔️ [Complex Ratio Masking for Monaural Speech Separation, DS Williamson, TASLP 2015] [Paper]
✔️ [Multitalker speech separation with utterance-level permutation invariant training of deep recurrent neural networks, M Kolbæk, TASLP 2017] [Paper][Code (Kai Li)]
✔️ [Deep attractor network for single-microphone speaker separation, Zhuo Chen, ICASSP 2017] [Paper][Code (Kai Li)]
✔️ [A consolidated perspective on multi-microphone speech enhancement and source separation, Sharon Gannot, TASLP 2017] [Paper]
✔️ [Alternative Objective Functions for Deep Clustering, Zhong-Qiu Wang, ICASSP 2018] [Paper]
✔️ [End-to-End Speech Separation with Unfolded Iterative Phase Reconstruction, Zhong-Qiu Wang et al., 2018] [Paper]
✔️ [Speaker-independent Speech Separation with Deep Attractor Network, Yi Luo, TASLP 2018] [Paper][Code (Kai Li)]
✔️ [TasNet: time-domain audio separation network for real-time, single-channel speech separation, Yi Luo, ICASSP 2018] [Paper][Code (Kai Li)][Code (asteroid)]
✔️ [Supervised Speech Separation Based on Deep Learning: An Overview, DeLiang Wang, Arxiv 2018] [Paper]
✔️ [An Overview of Lead and Accompaniment Separation in Music, Zafar Rafii, TASLP 2018] [Paper]
✔️ [SAGRNN: Self-Attentive Gated RNN for Binaural Speaker Separation with Interaural Cue Preservation, Ke Tan, IEEE Signal Processing Letters] [Paper]
✔️ [A convolutional recurrent neural network with attention framework for speech separation in monaural recordings, Chao Sun, Scientific Reports] [Paper]
✔️ [Unsupervised Sound Separation Using Mixture Invariant Training, Scott Wisdom, NeurIPS 2020] [Paper]
✔️ [Causal Deep CASA for Monaural Talker-Independent Speaker Separation, Yuzhou Liu, TASLP 2019] [Paper]
✔️ [Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation, Scott Wisdom, Arxiv 2021] [Paper]
✔️ [Tune-In: Training Under Negative Environments with Interference for Attention Networks Simulating Cocktail Party Effect, Jun Wang, Arxiv 2021] [Paper]
✔️ [Speech Separation Using an Asynchronous Fully Recurrent Convolutional Neural Network, Kai Li, NeurIPS 2021] [Paper][Code]
Multi-Modal Speech Separation
✔️ [Audio-Visual Speech Enhancement Using Multimodal Deep Convolutional Neural Networks, Jen-Cheng Hou, TETCI 2017] [Paper][Code]
✔️ [The Sound of Pixels, Hang Zhao, ECCV 2018] [Paper][Code]
✔️ [Looking to Listen at the Cocktail Party: A Speaker-Independent Audio-Visual Model for Speech Separation, Ariel Ephrat, ACM Transactions on Graphics 2018] [Paper][Code]
✔️ [Learning to Separate Object Sounds by Watching Unlabeled Video, Ruohan Gao, ECCV 2018] [Paper]
This list is probably incomplete. If you know of an excellent paper or tutorial that is missing, please add it using the format above. If you find the repository useful, please consider starring or forking it. Thank you!