The DARPA TIMIT Acoustic-Phonetic Continuous Speech Corpus

The TIMIT corpus of read speech has been designed to provide speech data for the acquisition of acoustic-phonetic knowledge and for the development and evaluation of automatic speech recognition systems. TIMIT has resulted from the joint efforts of several sites under sponsorship from the Defense Advanced Research Projects Agency - Information Science and Technology Office (DARPA-ISTO). Text corpus design was a joint effort among the Massachusetts Institute of Technology (MIT), Stanford Research Institute (SRI), and Texas Instruments (TI). The speech was recorded at TI, transcribed at MIT, and has been maintained, verified, and prepared for CD-ROM production by the National Institute of Standards and Technology (NIST). This file contains a brief description of the TIMIT Speech Corpus. Additional information including the referenced material and some relevant reprints of articles may be found in the printed documentation which is also available from NTIS (NTIS# PB91-100354).

Corpus Speaker Distribution

TIMIT contains a total of 6300 sentences, 10 sentences spoken by each of 630 speakers from 8 major dialect regions of the United States. Table 1 shows the number of speakers for the 8 dialect regions, broken down by sex. The percentages are given in parentheses. A speaker's dialect region is the geographical area of the U.S. where they lived during their childhood years. The geographical areas correspond with recognized dialect regions in the U.S. (Language Files, Ohio State University Linguistics Dept., 1982), with the exception of the Western region (dr7), in which dialect boundaries are not known with any confidence, and dialect region 8, where the speakers moved around a lot during their childhood.

   Table 1:  Dialect distribution of speakers

      Dialect
      Region(dr)    #Male    #Female    Total
      ----------  --------- ---------  ----------
         1         31 (63%)  18 (37%)   49 (8%)  
         2         71 (70%)  31 (30%)  102 (16%) 
         3         79 (77%)  23 (23%)  102 (16%) 
         4         69 (69%)  31 (31%)  100 (16%) 
         5         62 (63%)  36 (37%)   98 (16%) 
         6         30 (65%)  16 (35%)   46 (7%) 
         7         74 (74%)  26 (26%)  100 (16%) 
         8         22 (67%)  11 (33%)   33 (5%)
       ------     --------- ---------  ---------- 
         8        438 (70%) 192 (30%)  630 (100%)

The dialect regions are:
     dr1:  New England
     dr2:  Northern
     dr3:  North Midland
     dr4:  South Midland
     dr5:  Southern
     dr6:  New York City
     dr7:  Western
     dr8:  Army Brat (moved around)
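
The percentages in Table 1 follow directly from the raw speaker counts. The sketch below recomputes them, rounding to whole percent; the names `COUNTS` and `region_shares` are illustrative and not part of the corpus.

```python
# Speaker counts per dialect region, copied from Table 1 as (male, female).
COUNTS = {
    "dr1": (31, 18), "dr2": (71, 31), "dr3": (79, 23), "dr4": (69, 31),
    "dr5": (62, 36), "dr6": (30, 16), "dr7": (74, 26), "dr8": (22, 11),
}


def region_shares(counts):
    """Return {region: (male %, female %, % of all speakers)} as whole percents."""
    grand_total = sum(male + female for male, female in counts.values())
    shares = {}
    for region, (male, female) in counts.items():
        total = male + female
        shares[region] = (
            round(100 * male / total),         # share of males within the region
            round(100 * female / total),       # share of females within the region
            round(100 * total / grand_total),  # region's share of all speakers
        )
    return shares


shares = region_shares(COUNTS)
for region, (male_pct, female_pct, corpus_pct) in shares.items():
    print(f"{region}: male {male_pct}%, female {female_pct}%, of corpus {corpus_pct}%")
```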

Corpus Text Material

The text material in the TIMIT prompts (found in the file "prompts.txt") consists of 2 dialect "shibboleth" sentences designed at SRI, 450 phonetically-compact sentences designed at MIT, and 1890 phonetically-diverse sentences selected at TI. The dialect sentences (the SA sentences) were meant to expose the dialectal variants of the speakers and were read by all 630 speakers. The phonetically-compact sentences were designed to provide a good coverage of pairs of phones, with extra occurrences of phonetic contexts thought to be either difficult or of particular interest. Each speaker read 5 of these sentences (the SX sentences) and each text was spoken by 7 different speakers. The phonetically-diverse sentences (the SI sentences) were selected from existing text sources - the Brown Corpus (Kuchera and Francis, 1967) and the Playwrights Dialog (Hultzen, et al., 1964) - so as to add diversity in sentence types and phonetic contexts. The selection criteria maximized the variety of allophonic contexts found in the texts. Each speaker read 3 of these sentences, with each sentence being read only by a single speaker. Table 2 summarizes the speech material in TIMIT.

Table 2:  TIMIT speech material
  Sentence Type   #Sentences   #Speakers   Total   #Sentences/Speaker
  -------------   ----------   ---------   -----   ------------------
  Dialect (SA)          2         630       1260           2
  Compact (SX)        450           7       3150           5
  Diverse (SI)       1890           1       1890           3
  -------------   ----------   ---------   -----    ----------------
  Total              2342                   6300          10
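
The totals in Table 2 can be checked with a few lines of arithmetic. This is a minimal sketch; `MATERIAL` and `utterance_totals` are illustrative names, and the numbers are copied from the table.

```python
# For each sentence type, from Table 2:
# (distinct texts, speakers per text, sentences per speaker)
MATERIAL = {
    "sa": (2, 630, 2),
    "sx": (450, 7, 5),
    "si": (1890, 1, 3),
}
NUM_SPEAKERS = 630


def utterance_totals(material):
    """Number of recorded utterances per sentence type: texts x speakers per text."""
    return {kind: texts * readers for kind, (texts, readers, _) in material.items()}


totals = utterance_totals(MATERIAL)
distinct_texts = sum(texts for texts, _, _ in MATERIAL.values())
per_speaker = sum(count for _, _, count in MATERIAL.values())

print(totals)                      # utterances per sentence type
print(sum(totals.values()))        # all utterances in the corpus
print(distinct_texts, per_speaker)
```

The same totals also follow from the per-speaker view: 630 speakers times 2, 5, and 3 sentences give 1260, 3150, and 1890 utterances respectively.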

Suggested Training/Test Subdivision

The speech material has been subdivided into portions for training and testing. The criteria for the subdivision are described in the file "testset.doc". THIS SUBDIVISION HAS NO RELATION TO THE DATA DISTRIBUTED ON THE PROTOTYPE VERSION OF THE CDROM.

Core Test Set:

The test data has a core portion containing 24 speakers, 2 male and 1 female from each dialect region. The core test speakers are shown in Table 3. Each speaker read a different set of SX sentences. Thus the core test material contains 192 sentences, 5 SX and 3 SI for each speaker, each having a distinct text prompt.

    Table 3:  The core test set of 24 speakers

     Dialect        Male      Female
     -------       ------     ------
        1        DAB0, WBT0    ELC0    
        2        TAS1, WEW0    PAS0    
        3        JMP0, LNT0    PKT0    
        4        LLL0, TLS0    JLM0    
        5        BPM0, KLT0    NLP0    
        6        CMJ0, JDH0    MGD0    
        7        GRT0, NJM0    DHC0
        8        JLN0, PAM0    MLD0

Complete Test Set:

A more extensive test set was obtained by including the sentences from all speakers that read any of the SX texts included in the core test set. In doing so, no sentence text appears in both the training and test sets. This complete test set contains a total of 168 speakers and 1344 utterances, accounting for about 27% of the total speech material. The resulting dialect distribution of the 168 speaker test set is given in Table 4. The complete test material contains 624 distinct texts.

     Table 4:  Dialect distribution for complete test set

      Dialect    #Male   #Female   Total
      -------    -----   -------   -----
        1           7        4       11
        2          18        8       26
        3          23        3       26
        4          16       16       32
        5          17       11       28
        6           8        3       11
        7          15        8       23
        8           8        3       11
      -----      -----   -------   ------
      Total       112       56      168

CDROM TIMIT Directory and File Structure

The speech and associated data are organized on the CD-ROM according to the following hierarchy:

/<CORPUS>/<USAGE>/<DIALECT>/<SEX><SPEAKER_ID>/<SENTENCE_ID>.<FILE_TYPE>

 CORPUS :== timit
 USAGE :== train | test
 DIALECT :== dr1 | dr2 | dr3 | dr4 | dr5 | dr6 | dr7 | dr8 
             (see Table 1 for dialect code description)
 SEX :== m | f
 SPEAKER_ID :== <INITIALS><DIGIT>

      where, 
      INITIALS :== speaker initials, 3 letters
      DIGIT :== number 0-9 to differentiate speakers with identical
                initials

 SENTENCE_ID :== <TEXT_TYPE><SENTENCE_NUMBER>

      where,

      TEXT_TYPE :== sa | si | sx
                    (see "Corpus Text Material" for sentence text type description)
      SENTENCE_NUMBER :== 1 ... 2342

 FILE_TYPE :== wav | txt | wrd | phn
               (see Table 5 for file type description)

Examples: /timit/train/dr1/fcjf0/sa1.wav

(TIMIT corpus, training set, dialect region 1, female speaker, speaker-ID "cjf0", sentence text "sa1", speech waveform file)

/timit/test/dr5/mbpm0/sx407.phn

(TIMIT corpus, test set, dialect region 5, male speaker, speaker-ID "bpm0", sentence text "sx407", phonetic transcription file)
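
The path grammar above maps naturally onto a regular expression. The following is a minimal sketch, not an official tool: `TimitPath` and `parse_timit_path` are hypothetical names, and the parser lowercases its input on the assumption that some copies of the corpus use uppercase file names.

```python
import re
from typing import NamedTuple


class TimitPath(NamedTuple):
    usage: str            # "train" or "test"
    dialect: str          # "dr1" ... "dr8"
    sex: str              # "m" or "f"
    speaker_id: str       # 3 initials plus 1 digit, e.g. "cjf0"
    text_type: str        # "sa", "si" or "sx"
    sentence_number: int  # 1 ... 2342
    file_type: str        # "wav", "txt", "wrd" or "phn"


_PATH_RE = re.compile(
    r"^/timit/(?P<usage>train|test)/(?P<dialect>dr[1-8])/"
    r"(?P<sex>[mf])(?P<speaker_id>[a-z]{3}[0-9])/"
    r"(?P<text_type>sa|si|sx)(?P<number>[0-9]+)\.(?P<file_type>wav|txt|wrd|phn)$"
)


def parse_timit_path(path: str) -> TimitPath:
    """Split a corpus path into its components, or raise ValueError."""
    match = _PATH_RE.match(path.lower())  # lowercasing is an assumption, see above
    if match is None:
        raise ValueError(f"not a TIMIT utterance path: {path!r}")
    number = int(match["number"])
    if not 1 <= number <= 2342:
        raise ValueError(f"sentence number out of range: {number}")
    return TimitPath(
        usage=match["usage"],
        dialect=match["dialect"],
        sex=match["sex"],
        speaker_id=match["speaker_id"],
        text_type=match["text_type"],
        sentence_number=number,
        file_type=match["file_type"],
    )


first = parse_timit_path("/timit/train/dr1/fcjf0/sa1.wav")
second = parse_timit_path("/timit/test/dr5/mbpm0/sx407.phn")
print(first)
print(second)
```

Note that the directory name combines the sex code and the speaker ID, so "fcjf0" is female speaker "cjf0".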

Online documentation and tables are located in the directory "timit/doc". A brief description of each file in this directory can be found in the "Online Documentation" section.

File Types

The TIMIT corpus includes several files associated with each utterance. In addition to a speech waveform file (.wav), three associated transcription files (.txt, .wrd, .phn) exist. These associated files have the form:

<BEGIN_SAMPLE> <END_SAMPLE> <TEXT>
    . . .
<BEGIN_SAMPLE> <END_SAMPLE> <TEXT>

where,

                BEGIN_SAMPLE :== The beginning integer sample number for the 
                                 segment (Note: The first BEGIN_SAMPLE of each 
                                 file is always 0)

                END_SAMPLE :== The ending integer sample number for the segment
                               (Note: Because of the transcription method used,
                               the last END_SAMPLE in each transcription file 
                               may be less than the actual last sample in the
                               corresponding .wav file)

                TEXT :== <ORTHOGRAPHY> | <WORD_LABEL> | <PHONETIC_LABEL>

                where,

                     ORTHOGRAPHY :== Complete orthographic text transcription
                     WORD_LABEL :== Single word from the orthography
                     PHONETIC_LABEL :== Single phonetic transcription code
                                        (See "phoncode.doc" for description 
                                        of codes)

 Table 5:  Utterance-associated file types
 File Type                     Description
 ---------  ------------------------------------------------------

     .wav - SPHERE-headered speech waveform file.  (See the "/sphere"
            directory for speech file manipulation utilities.)

     .txt - Associated orthographic transcription of the words the
            person said.  (Usually this is the same as the prompt, but 
            in a few cases the orthography and prompt disagree.)

     .wrd - Time-aligned word transcription. The word boundaries
            were aligned with the phonetic segments using a dynamic
            string alignment program (see the printed documentation
            section "Notes on the Word Alignments" and the lexical
            pronunciations given in "timitdic.txt".)

     .phn - Time-aligned phonetic transcription.  (See the reprint
            of the article by Seneff and Zue (1988), in the printed
            documentation, and the section "Notes on Checking the
            Phonetic Transcriptions" for more details on the phonetic
            transcription protocols.)

Example transcriptions from the utterance in "/timit/test/dr5/fnlp0/sa1.wav":

Orthography (.txt):
        0 61748 She had your dark suit in greasy wash water all year.

Word label (.wrd):
        7470 11362 she
        11362 16000 had
        15420 17503 your
        17503 23360 dark
        23360 28360 suit
        28360 30960 in
        30960 36971 greasy
        36971 42290 wash
        43120 47480 water
        49021 52184 all
        52184 58840 year

Phonetic label (.phn): 
(Note: beginning and ending silence regions are marked with h#)
        0 7470 h#
        7470 9840 sh
        9840 11362 iy
        11362 12908 hv
        12908 14760 ae
        14760 15420 dcl
        15420 16000 jh
        16000 17503 axr
        17503 18540 dcl
        18540 18950 d
        18950 21053 aa
        21053 22200 r
        22200 22740 kcl
        22740 23360 k
        23360 25315 s
        25315 27643 ux
        27643 28360 tcl
        28360 29272 q
        29272 29932 ih
        29932 30960 n
        30960 31870 gcl
        31870 32550 g
        32550 33253 r
        33253 34660 iy
        34660 35890 z
        35890 36971 iy
        36971 38391 w
        38391 40690 ao
        40690 42290 sh
        42290 43120 epi
        43120 43906 w
        43906 45480 ao
        45480 46040 dx
        46040 47480 axr
        47480 49021 q
        49021 51348 ao
        51348 52184 l
        52184 54147 y
        54147 56654 ih
        56654 58840 axr
        58840 61680 h#
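
The three transcription file types share one line format, so a single small parser covers all of them. The sketch below is illustrative only: `Segment`, `parse_transcription`, `phones_in_word`, and `duration_seconds` are hypothetical names. The sampling rate is not stated in this file; the 16 kHz default is an assumption and should be checked against the header of the corresponding .wav file.

```python
from typing import List, NamedTuple


class Segment(NamedTuple):
    begin: int  # first sample of the segment
    end: int    # ending sample of the segment
    text: str   # orthography, word label, or phonetic label


def parse_transcription(content: str) -> List[Segment]:
    """Parse the contents of a .txt, .wrd or .phn file into segments."""
    segments = []
    for line in content.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split only twice so that a full sentence (.txt) stays in one piece.
        begin, end, text = line.split(maxsplit=2)
        segments.append(Segment(int(begin), int(end), text))
    return segments


def phones_in_word(word: Segment, phones: List[Segment]) -> List[str]:
    """Labels of the phones that lie entirely inside the word's sample range."""
    return [p.text for p in phones if p.begin >= word.begin and p.end <= word.end]


def duration_seconds(segment: Segment, sample_rate: int = 16000) -> float:
    """Segment length in seconds; the 16 kHz default is an assumption."""
    return (segment.end - segment.begin) / sample_rate


# First lines of the example above.
WRD = """7470 11362 she
11362 16000 had
15420 17503 your"""
PHN = """0 7470 h#
7470 9840 sh
9840 11362 iy
11362 12908 hv
12908 14760 ae
14760 15420 dcl
15420 16000 jh
16000 17503 axr"""

words = parse_transcription(WRD)
phones = parse_transcription(PHN)
for word in words:
    print(word.text, phones_in_word(word, phones))
```

In this example the words "had" and "your" overlap by one segment: the phone "jh" (samples 15420 to 16000) falls inside both word ranges, so word boundaries should not be assumed to be disjoint.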

Online Documentation

Compact documentation is located in the "/timit/doc" directory. Files in this directory with a ".doc" extension contain freeform descriptive text and files with a ".txt" extension contain tables of formatted text which can be searched programmatically. Lines in the ".txt" files beginning with a semicolon are comments and should be ignored on searches. The following is a brief description of their contents:

phoncode.doc - Table of phone symbols used in phonemic dictionary and 
               phonetic transcriptions
 prompts.txt - Table of sentence prompts and sentence-ID numbers
spkrinfo.txt - Table of speaker attributes
spkrsent.txt - Table of sentence-ID numbers for each speaker
 testset.doc - Description of suggested train/test subdivision
timitdic.doc - Description of phonemic lexicon
timitdic.txt - Phonemic dictionary of all orthographic words in prompts
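
When searching the ".txt" tables programmatically, the comment lines have to be filtered out first. The sketch below shows one way to do that; `table_lines` is a hypothetical helper, and the sample text is made up for illustration and does not reproduce the layout of any real file in the directory. Skipping blank lines and tolerating leading whitespace before the semicolon are additional assumptions.

```python
from typing import Iterable, Iterator


def table_lines(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data lines of a ".txt" table, skipping comments and blank lines."""
    for line in lines:
        stripped = line.rstrip("\r\n")
        if not stripped.strip():
            continue  # blank line (assumption: carries no data)
        if stripped.lstrip().startswith(";"):
            continue  # comment line, to be ignored in searches
        yield stripped


# Made-up sample content, for illustration only.
SAMPLE = """; hypothetical comment line
first data row
; another comment

second data row
"""

rows = list(table_lines(SAMPLE.splitlines()))
print(rows)
```

With a real file, the same generator can be fed an open file object, for example `table_lines(open(path))`, where `path` points at one of the tables listed above.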

A more extensive description of corpus design, collection, and transcription can be found in the printed documentation.

License: No license specified, the work may be protected by copyright.
