
A PyTorch implementation of "Conv-TasNet: Surpassing Ideal Time-Frequency Magnitude Masking for Speech Separation"

Conv-TasNet

‼️new‼️: The training and testing code has been revised and now separates speech properly.

‼️new‼️: Updated the model code and added the skip-connection section.

‼️notice‼️: Set the training batch size to 8 or 16.

‼️notice‼️: An implementation of another paper that improves on Conv-TasNet has been open-sourced as "Deep-Encoder-Decoder-Conv-TasNet".

Demo page: results of the pure speech separation model


Luo Y, Mesgarani N. Conv-TasNet: Surpassing Ideal Time–Frequency Magnitude Masking for Speech Separation[J]. IEEE/ACM Transactions on Audio, Speech, and Language Processing, 2019, 27(8): 1256-1266.


Requirements

  • PyTorch 1.3.0
  • TorchAudio 0.3.1
  • PyYAML 5.1.2

Accomplished goals

  • Supports multi-GPU training (see train.yml)
  • Uses PyTorch's built-in DataLoader
  • Provides pre-trained models

Preparing files before training

  1. Generate the dataset from WSJ0 or TIMIT using create-speaker-mixtures.zip
  2. Generate the scp files using the create_scp.py script
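
The scp step above can be sketched as follows. This is a minimal illustration, not the repository's create_scp.py: it assumes a Kaldi-style layout in which each line holds an utterance key followed by the path to a wav file, and the helper names write_scp and read_scp are hypothetical.

```python
import os
import tempfile


def write_scp(wav_dir, scp_path):
    """Write one '<key> <absolute path>' line per .wav file in wav_dir.

    Assumption: the file name is used as the key and contains no spaces.
    """
    names = sorted(n for n in os.listdir(wav_dir) if n.lower().endswith(".wav"))
    with open(scp_path, "w", encoding="utf-8") as f:
        for name in names:
            path = os.path.abspath(os.path.join(wav_dir, name))
            f.write(f"{name} {path}\n")
    return len(names)


def read_scp(scp_path):
    """Read an scp file back into a {key: path} dictionary."""
    entries = {}
    with open(scp_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            # Split on the first whitespace only, so paths may contain spaces.
            key, path = line.split(maxsplit=1)
            entries[key] = path
    return entries


# Small demonstration on empty placeholder files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("mix_002.wav", "mix_001.wav", "notes.txt"):
        open(os.path.join(tmp, name), "w").close()
    scp_file = os.path.join(tmp, "mix.scp")
    count = write_scp(tmp, scp_file)
    print(count, sorted(read_scp(scp_file)))
```

Sorting the file names keeps the mixture and source scp files in the same order when they are generated from directories with matching file names.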

Training this model

  • If you want to adjust the network parameters and the path of the training file, please modify the option/train/train.yml file.
  • Training Command
    python train.py ./option/train/train.yml

Inference with this model

  • Inference command (use this command to test a large number of audio files)

    python Separation.py -mix_scp 1.scp -yaml ./config/train/train.yml -model best.pt -gpuid [0,1,2,3,4,5,6,7] -save_path ./checkpoint
  • Inference command (use this command to test a single audio file)

    python Separation_wav.py -mix_wav 1.wav -yaml ./config/train/train.yml -model best.pt -gpuid [0,1,2,3,4,5,6,7] -save_path ./checkpoint
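
The -gpuid value in the commands above is written as a bracketed list. How the scripts parse it is not shown here; the sketch below is one possible approach with argparse, and parse_gpuid and the reduced option set are hypothetical.

```python
import argparse


def parse_gpuid(text):
    """Turn '[0,1,2]' or '0,1,2' into [0, 1, 2]; an empty list means no GPU."""
    inner = text.strip().strip("[]")
    if not inner.strip():
        return []
    return [int(token) for token in inner.split(",")]


# Hypothetical command line exposing two of the options used above.
parser = argparse.ArgumentParser(description="hypothetical separation CLI")
parser.add_argument("-mix_scp", type=str, help="scp file listing the mixtures")
parser.add_argument("-gpuid", type=parse_gpuid, default=[], help="GPU ids, e.g. [0,1]")

args = parser.parse_args(["-mix_scp", "1.scp", "-gpuid", "[0,1]"])
print(args.gpuid)  # prints [0, 1]
```

Because the converter is passed as the argparse type, a malformed value such as [0,a] is rejected with a usage error before any model is loaded.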

Results

  • Training is in progress; results will be posted once it finishes.
  • The following table lists the experimental results for different hyperparameter settings reported in the paper.

N    L   B    H    Sc   P  X  R  Normalization  Causal  Receptive field (s)  Model size  SI-SNRi (dB)  SDRi (dB)
128  40  128  256  128  3  7  2  gLN            x       1.28                 1.5M        13.0          13.3
256  40  128  256  128  3  7  2  gLN            x       1.28                 1.5M        13.1          13.4
512  40  128  256  128  3  7  2  gLN            x       1.28                 1.7M        13.3          13.6
512  40  128  256  256  3  7  2  gLN            x       1.28                 2.4M        13.0          13.3
512  40  128  512  128  3  7  2  gLN            x       1.28                 3.1M        13.3          13.6
512  40  128  512  512  3  7  2  gLN            x       1.28                 6.2M        13.5          13.8
512  40  256  256  256  3  7  2  gLN            x       1.28                 3.2M        13.0          13.3
512  40  256  512  256  3  7  2  gLN            x       1.28                 6.0M        13.4          13.7
512  40  256  512  512  3  7  2  gLN            x       1.28                 8.1M        13.2          13.5
512  40  128  512  128  3  6  4  gLN            x       1.27                 5.1M        14.1          14.4
512  40  128  512  128  3  4  6  gLN            x       0.46                 5.1M        13.9          14.2
512  40  128  512  128  3  8  3  gLN            x       3.83                 5.1M        14.5          14.8
512  32  128  512  128  3  8  3  gLN            x       3.06                 5.1M        14.7          15.0
512  16  128  512  128  3  8  3  gLN            x       1.53                 5.1M        15.3          15.6
512  16  128  512  128  3  8  3  cLN            ✓       1.53                 5.1M        10.6          11.0
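
The receptive field column can be reproduced from L, P, X and R. The sketch below is a rough calculation under stated assumptions that are not spelled out in this README: the convolution blocks in each of the R repeats use dilations 1, 2, ..., 2^(X-1), the encoder hop is L/2 samples, and the audio is sampled at 8 kHz. The function name receptive_field_seconds is hypothetical.

```python
def receptive_field_seconds(L, P, X, R, sample_rate=8000):
    """Estimate the separator's receptive field in seconds.

    Assumptions: dilations 1, 2, ..., 2**(X - 1) in each of R repeats,
    an encoder hop of L // 2 samples, and 8 kHz audio.
    """
    # A convolution with kernel size P and dilation d widens the field
    # by (P - 1) * d frames; the dilations of one repeat sum to 2**X - 1.
    frames = 1 + R * (P - 1) * (2 ** X - 1)
    hop = L // 2
    # Convert encoder frames to waveform samples: hops plus one full window.
    samples = (frames - 1) * hop + L
    return samples / sample_rate


for L, X, R in [(40, 7, 2), (40, 4, 6), (16, 8, 3)]:
    print(L, X, R, receptive_field_seconds(L, P=3, X=X, R=R))
```

Under these assumptions the three settings evaluate to 1.275, 0.455 and 1.532 seconds, in line with the 1.28, 0.46 and 1.53 entries in the table.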

Pre-trained Model

‼️new‼️: Pre-trained models are available on Hugging Face and Google Drive.

