  • Language: Python
  • License: Apache License 2.0

Speech Recognition Using Tacotron

Motivation

Tacotron is an end-to-end speech generation model first introduced in Towards End-to-End Speech Synthesis. It takes character-level text as input and predicts mel filterbank features and a linear spectrogram. Although it is a generation model, I wanted to test how well it can be applied to the speech recognition task.

Requirements

  • NumPy >= 1.11.1
  • TensorFlow == 1.1
  • librosa

Model description

Tacotron: Speech Synthesis Model (from Figure 1 in Towards End-to-End Speech Synthesis)

Modified architecture for speech recognition
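
For recognition, the roles of text and audio are presumably swapped relative to synthesis: acoustic features go in and characters come out. The following is a minimal sketch of a character-level target encoding under that assumption. The vocabulary, the `P`/`E` padding and end-of-sentence symbols, and the function names are hypothetical and are not taken from this repository's code.

```python
import string

# Hypothetical vocabulary: padding, end-of-sentence, space, apostrophe, a-z.
# The repository's real vocabulary may differ.
PAD, EOS = "P", "E"
ALLOWED = " '" + string.ascii_lowercase
VOCAB = PAD + EOS + ALLOWED
CHAR2IDX = {ch: i for i, ch in enumerate(VOCAB)}
IDX2CHAR = {i: ch for ch, i in CHAR2IDX.items()}


def encode(text, max_len):
    """Lowercase the text, drop unsupported characters, and return
    exactly max_len indices ending in EOS followed by padding."""
    chars = [ch for ch in text.lower() if ch in ALLOWED]
    ids = [CHAR2IDX[ch] for ch in chars][: max_len - 1]
    ids.append(CHAR2IDX[EOS])
    ids += [CHAR2IDX[PAD]] * (max_len - len(ids))
    return ids


def decode(ids):
    """Turn indices back into text, stopping at the first EOS."""
    out = []
    for i in ids:
        ch = IDX2CHAR[i]
        if ch == EOS:
            break
        if ch != PAD:
            out.append(ch)
    return "".join(out)
```

A decoder trained on targets like these emits one index per step, and `decode` turns the predicted sequence back into a lowercase transcript.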

Data

The World English Bible is a public domain update of the American Standard Version of 1901 into modern English. Its text and audio recordings are freely available here. Unfortunately, each audio file corresponds to a chapter rather than a verse, so the files are too long for many machine learning tasks. I had someone slice them by verse manually. You can download the audio data and its text from my dropbox.

File description

  • hyperparams.py includes all hyperparameters.
  • prepro.py creates training and evaluation data in the data/ folder.
  • data_load.py loads data and puts it in queues so that multiple mini-batches are generated in parallel.
  • utils.py has some operational functions.
  • modules.py contains building blocks for encoding and decoding networks.
  • networks.py defines encoding and decoding networks.
  • train.py executes training.
  • eval.py executes evaluation.
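
The queue-based loading that data_load.py is described as doing can be illustrated with the standard library alone. This sketch is an analogy, not the repository's TensorFlow queue code: several worker threads build mini-batches and push them onto a bounded queue while the consumer pulls them off. All names here (`load_batches`, `_worker`, `num_workers`, `capacity`) are hypothetical.

```python
import queue
import threading

_DONE = object()  # sentinel a worker pushes when its shard is exhausted


def _worker(shard, batch_size, out_queue):
    """Cut one shard of the data into mini-batches and enqueue them."""
    for start in range(0, len(shard), batch_size):
        out_queue.put(shard[start:start + batch_size])
    out_queue.put(_DONE)


def load_batches(samples, batch_size, num_workers=2, capacity=8):
    """Yield mini-batches produced in parallel by num_workers threads.

    The bounded queue makes producers wait when the consumer falls behind.
    """
    out_queue = queue.Queue(maxsize=capacity)
    for i in range(num_workers):
        shard = samples[i::num_workers]  # round-robin split of the data
        threading.Thread(
            target=_worker,
            args=(shard, batch_size, out_queue),
            daemon=True,
        ).start()

    finished = 0
    while finished < num_workers:
        item = out_queue.get()
        if item is _DONE:
            finished += 1
        else:
            yield item
```

Batches from different workers arrive in a nondeterministic order, which is usually acceptable for training because the data is shuffled anyway.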

Training

  • STEP 1. Adjust hyperparameters in hyperparams.py if necessary.
  • STEP 2. Download and extract the audio data and its text.
  • STEP 3. Run train.py. Alternatively, download my pretrained file.

Evaluation

  • Run eval.py to get speech recognition results for the test set.

Results

The training curve looks like this: [figure: training curve]

Sample results are as follows, where Expected is the reference transcript and Got is the model output:

Expected: the third poured out his bowl into the rivers and springs of water and they became blood
Got : the first will lie down to the rivers and springs of waters and it became blood

Expected: i heard the altar saying yes lord god the almighty true and righteous are your judgments
Got : i heard the altar saying yes were like your own like you tree in righteousness for your judgments

Expected: the fourth poured out his bowl on the sun and it was given to him to scorch men with fire
Got : the foolish very armed were on the sun and was given to him to spoke to him with fire

Expected: he gathered them together into the place which is called in hebrew megiddo
Got : he gathered them together into the place which is called and he weep and at every

Expected: every island fled away and the mountains were not found
Got : hadad and kedemoth aroen and another and spread out them

Expected: here is the mind that has wisdom the seven heads are seven mountains on which the woman sits
Got : he is the mighty have wisdom the seven heads of seven rountains are with the wind sixter

Expected: these have one mind and they give their power and authority to the beast
Got : these are those who are mine and they give holl of a fool in the deeps

Expected: the woman whom you saw is the great city which reigns over the kings of the earth
Got : the woman whom he saw it his degrection which ran and to advening to be ear

Expected: for her sins have reached to the sky and god has remembered her iniquities
Got : for he sends a least in the sky and god has remembered her iniquities

Expected: the merchants of the earth weep and mourn over her for no one buys their merchandise any more
Got : the mittites of the earth weeps in your own are before from knowing babylon busine backsliding all t

Expected: and cried out as they looked at the smoke of her burning saying 'what is like the great city'
Got : and cried all the wicked beside of a good one and saying when is like the great sight

Expected: in her was found the blood of prophets and of saints and of all who have been slain on the earth
Got : and her with stones a dwellified confidence and all who have been slain on the earth

Expected: a second said hallelujah her smoke goes up forever and ever
Got : as set him said how many men utter for smoke go down for every male it

Expected: he is clothed in a garment sprinkled with blood his name is called the word of god
Got : he is close in a garment speaking in the blood his name is called 'the word of god'

Expected: the armies which are in heaven followed him on white horses clothed in white pure fine linen
Got : the army which are in heaven falls on the mighty one horses clothes driven on the affliction

Expected: he has on his garment and on his thigh a name written king of kings and lord of lords
Got : he has understandings on his folly among widow the king of kings and yahweh of armies

Expected: i saw an angel coming down out of heaven having the key of the abyss and a great chain in his hand
Got : i saw an even become young lion having you trust of the ages and a great chamber is hand

Expected: and after the thousand years satan will be released from his prison
Got : and after the palace and mizpah and eleven eleenth were the twentieth

Expected: death and hades were thrown into the lake of fire this is the second death the lake of fire
Got : let them hate with one and to wait for fire this is the second death and lead a time

Expected: if anyone was not found written in the book of life he was cast into the lake of fire
Got : the ten man will not think within your demon as with a blood he will cast him to ram for fire

Expected: he who overcomes i will give him these things i will be his god and he will be my son
Got : he who recompenses i will give him be stings i will be his god and he will be my son

Expected: its wall is one hundred fortyfour cubits by the measure of a man that is of an angel
Got : is through all his womb home before you for accusation that we may know him by these are in egypt

Expected: the construction of its wall was jasper the city was pure gold like pure glass
Got : if he struck him of his wallor is not speaking with torment hold on her grass

Expected: i saw no temple in it for the lord god the almighty and the lamb are its temple
Got : i saw in a tenth wind for we will dry up you among the linen ox skillful

Expected: its gates will in no way be shut by day for there will be no night there
Got : his greech wind more redeems shameful the redeemer man don't know

Expected: and they shall bring the glory and the honor of the nations into it so that they may enter
Got : and they shall bring the glory in the high mountains and the egyptian into the midst of the needy

Expected: they will see his face and his name will be on their foreheads
Got : they will see his face and his name on their follows

Expected: behold i come quickly blessed is he who keeps the words of the prophecy of this book
Got : behold i happened with me when i could see me to still it is a prophet his bueld

Expected: he said to me don't seal up the words of the prophecy of this book for the time is at hand
Got : he said to him why sil with the words of the prophets it is book for the times and her

Expected: behold i come quickly my reward is with me to repay to each man according to his work
Got : behold i come perfect i yahweh is with me to repent to be shamed according to his work

Expected: i am the alpha and the omega the first and the last the beginning and the end
Got : i have you hope from you and you and the first from aloes of the dew and the enemy

Expected: he who testifies these things says yes i come quickly amen yes come lord jesus
Got : he who testifies these things says yes i come proclaim i man listen will jesus
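
The pairs above are easier to compare with a number. No error rate is reported in this README, so the following is only a sketch of how one could score such pairs with word error rate (WER): the word-level edit distance divided by the number of reference words. The function names are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def word_error_rate(expected, got):
    """Word-level edit distance divided by the reference length."""
    ref = expected.split()
    hyp = got.split()
    if not ref:
        raise ValueError("expected transcript must not be empty")
    return edit_distance(ref, hyp) / len(ref)


expected = "they will see his face and his name will be on their foreheads"
got = "they will see his face and his name on their follows"
# 3 word edits over 13 reference words
print(round(word_error_rate(expected, got), 3))  # 0.231
```

Because insertions count as errors, WER can exceed 1.0 when the output is much longer than the reference. Applying the same function to character lists instead of word lists gives a character error rate.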
