  • License
    MIT License
  • Created over 7 years ago
  • Updated over 7 years ago

Repository Details

A Collection of Speech Corpora for ASR and TTS

Speech-Corpus-Collection

This repo is a collection of speech corpora for automatic speech recognition (ASR) and text-to-speech (TTS).

ASR Corpus

  1. VCTK
    Around 10.4 GB. An alternative host is also available.

  2. LibriSpeech
    Large-scale (1000 hours) corpus of read English speech.

  3. TEDLIUM release 2
    The TED-LIUM corpus was built from the audio of talks and their transcriptions available on the TED website. The authors prepared and filtered this data to train acoustic models for the International Workshop on Spoken Language Translation 2011, where the LIUM English/French SLT system ranked first in the SLT task.

TTS Corpus

  1. CMU ARCTIC Databases
    The databases consist of around 1150 utterances, including US English male (bdl) and female (slt) speakers, as well as other accented speakers.

  2. The World English Bible
    The World English Bible is a public domain update of the American Standard Version of 1901 into modern English. Its text and audio recordings are freely available here. Unfortunately, each audio file covers a whole chapter rather than a single verse, so most files are too long to use directly. Kyubyong manually sliced them by verse; the sliced files are available on his Dropbox.

  3. Nancy Corpus
    The Nancy corpus from the 2011 Blizzard Challenge. The data is freely available for research use upon signing a license.
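
For chapter-length recordings such as the World English Bible audio above, cutting by verse can be scripted once verse boundaries are known. The following is only a minimal sketch, not the procedure Kyubyong used (the slicing described above was done by hand). It assumes the audio has already been converted to uncompressed WAV and that the verse start and end times in seconds are available from some other source; `slice_wav`, `boundaries_sec`, and the output file pattern are hypothetical names chosen for illustration.

```python
import wave


def slice_wav(src_path, boundaries_sec, out_pattern="verse_{:03d}.wav"):
    """Write one WAV file per (start, end) pair, with times given in seconds.

    Hypothetical helper: the boundaries must be supplied by the caller,
    they are not detected automatically.
    """
    out_paths = []
    with wave.open(src_path, "rb") as src:
        params = src.getparams()      # channels, sample width, frame rate, ...
        rate = src.getframerate()
        for index, (start, end) in enumerate(boundaries_sec, start=1):
            # Jump to the first frame of the segment and read its frames.
            src.setpos(int(start * rate))
            frames = src.readframes(int((end - start) * rate))
            out_path = out_pattern.format(index)
            with wave.open(out_path, "wb") as dst:
                dst.setparams(params)     # frame count is corrected on close
                dst.writeframes(frames)
            out_paths.append(out_path)
    return out_paths


# Example call (file name and times are placeholders):
# slice_wav("chapter.wav", [(0.0, 7.4), (7.4, 15.1)])
```

Only the standard-library `wave` module is used here, so the sketch works for PCM WAV input only; compressed formats would need to be decoded first.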

General

  1. The NSynth Dataset
    NSynth is an audio dataset containing 305,979 musical notes, each with a unique pitch, timbre, and envelope. For 1,006 instruments from commercial sample libraries, its authors generated four-second, monophonic 16 kHz audio snippets, referred to as notes, by ranging over every pitch of a standard MIDI piano (21-108) as well as five different velocities (25, 50, 75, 100, 127). Each note was held for the first three seconds and allowed to decay for the final second.
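
The figures in the NSynth description can be turned into a few useful sizes with simple arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative and are not part of any NSynth tooling.

```python
# Figures quoted in the NSynth description above.
SAMPLE_RATE_HZ = 16_000
NOTE_SECONDS = 4
SUSTAIN_SECONDS = 3
MIDI_PITCHES = range(21, 109)            # 21..108 inclusive
VELOCITIES = (25, 50, 75, 100, 127)
NUM_INSTRUMENTS = 1_006
NUM_NOTES = 305_979

# Length of one note in samples, split into held and decaying parts.
samples_per_note = SAMPLE_RATE_HZ * NOTE_SECONDS        # 64000
sustain_samples = SAMPLE_RATE_HZ * SUSTAIN_SECONDS      # 48000
decay_samples = samples_per_note - sustain_samples      # 16000

# Size of the full pitch x velocity grid, per instrument and overall.
grid_per_instrument = len(MIDI_PITCHES) * len(VELOCITIES)   # 440
full_grid = grid_per_instrument * NUM_INSTRUMENTS           # 442640

# Total audio duration of the dataset.
total_hours = NUM_NOTES * NOTE_SECONDS / 3600

print(f"samples per note: {samples_per_note}")
print(f"full grid size:   {full_grid}")
print(f"notes in dataset: {NUM_NOTES}")
print(f"total audio:      {total_hours:.1f} hours")
```

The full grid (442,640 combinations) is larger than the 305,979 notes actually in the dataset, so not every instrument is present at every pitch and velocity. Code that iterates over NSynth should therefore not assume a complete grid.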

Contact Me

Yunchao He
Weibo
