Discover WangHelin1997/Automatic_Speech_Annotator Open Source project

Stars
27
Rank 901,033 (Top 18 %)
Language
Python
Created 5 months ago
Updated 3 months ago

WangHelin1997


Automatic speech annotator that processes speech with voice activity detection, overlapping speech detection, speaker diarization, and automatic speech recognition

SpeechTasks

This is a list of speech tasks and datasets, which can provide training data for Generative AI, AIGC, AI model training, intelligent speech tool development, and speech applications.

MaskSpec

The PyTorch implementation of the paper: Masked Spectrogram Prediction For Self-Supervised Audio Pre-Training

Python

DuTa-VC

Source code and demo for INTERSPEECH 2023 paper: DuTa-VC: A Duration-aware Typical-to-atypical Voice Conversion Approach with Diffusion Probabilistic Model

Python

SpecAugment-plus

A PyTorch implementation of the paper: SpecAugment++: A Hidden Space Data Augmentation Method for Acoustic Scene Classification

Python

nnAudio2

Audio processing using PyTorch 1D convolution networks (based on nnAudio). Gammatone Spectrogram and SpecAugmentation are now available on GPU.

Python

DCASE-2020-Task1A-Code

A PyTorch implementation of the paper: Acoustic Scene Classification with Multiple Decision Schemes.

Python

Fast-GeCo

Source code and demo for INTERSPEECH 2024 paper: Noise-robust Speech Separation with Fast Generative Correction

AT-GCN

PyTorch implementation of the paper: Modeling Label Dependencies for Audio Tagging with Graph Convolutional Network

Python

LibriLightMix-WHAMR

Python scripts to create noisy and reverberant 2-speaker mixture audio with Libri-Light and WHAM

Python

GL-AT

PyTorch implementation of the paper: A Global-local Attention Framework for Weakly Labelled Audio Tagging.

Python

Aty-TTS

Aty-TTS: Improving fairness for spoken language understanding in atypical speech with Text-to-Speech

Python

DCASE2021_Task6_PKU

This is the code of PKU team for DCASE 2021 Task 6.

Python

LibriLightMix-WHAM

Python scripts to create noisy mixture audio with Libri-Light and WHAM

Python

FPNet

A signal segmentation method of CNN for audio event classification

Python

CNN-model-and-visualization

A CNN model (ResNet) for image classification (CIFAR-10), including visualization of filters and layer outputs.

Python

Speech-paper-crawl

My Python scripts for crawling papers related to speech processing.

Python

Du-N2DVC-Demo

HTML

SCNN

SincConv layer used in AED and ASC

Python

Speech-Captioning-Dataset

Python

project2021

PKU team for 2021 project 'Guangchangwu detection'.

Python

DCASE2020-Task6-PKU

A PyTorch implementation of the DCASE2020 Task6 system by the PKU team: Automated Audio Captioning With Temporal Attention

Python

Babycry-sound-detection

PyTorch implementations of neural network models for Babycry sound detection, including training process and test demo. Based on DCASE2017 Task2: Detection of rare sound events.

CommonVoice

Aty-TTS-Demo

helinwang

Pytorch-audio_feature

Audio feature extraction as a PyTorch module.

Python

dcase2019_1D

DCASE2019 Task1a using the audio feature module.

Python

SSR-Speech-Demo

https://wanghelin1997.github.io/SSR-Speech-Demo/

JavaScript
