ast
Code for the Interspeech 2021 paper "AST: Audio Spectrogram Transformer".
ssast
Code for the AAAI 2022 paper "SSAST: Self-Supervised Audio Spectrogram Transformer".
ltu
Code, Dataset, and Pretrained Models for Audio and Speech Large Language Model "Listen, Think, and Understand".
whisper-at
Code and Pretrained Models for Interspeech 2023 Paper "Whisper-AT: Noise-Robust Automatic Speech Recognizers are Also Strong Audio Event Taggers"
cav-mae
Code and Pretrained Models for ICLR 2023 Paper "Contrastive Audio-Visual Masked Autoencoder".
gopt
Code for the ICASSP 2022 paper "Transformer-Based Multi-Aspect Multi-Granularity Non-native English Speaker Pronunciation Assessment".
psla
Code for the TASLP paper "PSLA: Improving Audio Tagging With Pretraining, Sampling, Labeling, and Aggregation".
vocalsound
Dataset and baseline code for the VocalSound dataset (ICASSP 2022).
uavm
Code for the IEEE Signal Processing Letters 2022 paper "UAVM: Towards Unifying Audio and Visual Models".
python-compute-eer
Simple Python script to compute equal error rate (EER) for machine learning model evaluation.
ReMASC
ReMASC: Realistic Replay Attack Corpus for Voice Controlled Systems
realtime-adversarial-attack
Code for IJCAI 2019 paper "Real-time Adversarial Attack".
llm_speech_emotion_challenge
multichannel-antispoof
Code for SPL paper "Detecting Replay Attacks Using Multi-Channel Audio: A Neural Network-Based Method"
efficient-voice-antispoof
kaldi-abbr
kaldi name convention note
Speech_DB_Engine