• Stars
    2,166
  • Rank 21,290 (Top 0.5 %)
  • Language
    Python
  • License
    Other
  • Created almost 9 years ago
  • Updated 10 months ago

Repository Details

🎙Speech recognition using the tensorflow deep learning framework, sequence-to-sequence neural networks

Tensorflow Speech Recognition

Speech recognition using Google's TensorFlow deep learning framework and sequence-to-sequence neural networks.

Replaces caffe-speech-recognition; see that project for some background.

Update: Mozilla released DeepSpeech

They achieve good error rates. Free Speech is in good hands; go there if you are an end user. For now, this project is maintained only for educational purposes.

Ultimate goal

Create a decent standalone speech recognizer for Linux etc. Some people say we have the models but not enough training data. We disagree: there is plenty of training data (a 100GB corpus and a 21GB corpus on openslr.org, synthetic text-to-speech snippets, movies with transcripts, Gutenberg, YouTube with captions, etc.). We just need a simple yet powerful model. It's only a question of time...

Sample spectrogram, That's what she said, too laid?

Sample spectrogram, Karen uttering 'zero' at 160 words per minute.
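
A spectrogram like the samples captioned above shows how the energy of a recording is distributed over frequency and time. The following is a rough, dependency-free sketch of how such a picture can be computed, not the code this project uses. The function name `spectrogram`, the frame size, the hop length, and the synthetic 1000 Hz test tone are all illustrative assumptions; in practice a numerical library would replace the naive transform.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Return one list of magnitudes (bins 0..frame_size//2) per overlapping frame."""
    # A Hann window tapers each frame to reduce leakage between frequency bins.
    window = [0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1))
              for n in range(frame_size)]
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = [samples[start + n] * window[n] for n in range(frame_size)]
        magnitudes = []
        for k in range(frame_size // 2 + 1):
            # Naive discrete Fourier transform for bin k (slow, but explicit).
            acc = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_size)
                      for n in range(frame_size))
            magnitudes.append(abs(acc))
        frames.append(magnitudes)
    return frames


# Hypothetical test signal: a pure 1000 Hz tone sampled at 8000 Hz.
SAMPLE_RATE = 8000
TONE_HZ = 1000
signal = [math.sin(2 * math.pi * TONE_HZ * t / SAMPLE_RATE) for t in range(512)]

spec = spectrogram(signal)
# Index of the strongest frequency bin in the first frame.
peak_bin = max(range(len(spec[0])), key=lambda k: spec[0][k])
# Each bin spans SAMPLE_RATE / frame_size Hz (125 Hz with the defaults).
peak_hz = peak_bin * SAMPLE_RATE / 64
print(len(spec), "frames,", len(spec[0]), "bins, peak near", peak_hz, "Hz")
```

Each row of `spec` corresponds to one time slice and each column to one frequency band, which is the grid a spectrogram image renders as pixel intensities.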

Installation

clone code

git clone https://github.com/pannous/tensorflow-speech-recognition
cd tensorflow-speech-recognition
git clone https://github.com/pannous/layer.git
git clone https://github.com/pannous/tensorpeers.git

pyaudio

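Requires PortAudio from http://www.portaudio.com/

# Build PortAudio from source and install it under a local prefix
git clone https://git.assembla.com/portaudio.git
cd portaudio
./configure --prefix=/path/to/your/local
make
make install
# Make the locally installed library and headers visible to the linker and compiler
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/path/to/your/local/lib
export LIBRARY_PATH=$LIBRARY_PATH:/path/to/your/local/lib
export CPATH=$CPATH:/path/to/your/local/include
# If you added the exports to ~/.bashrc, reload it
source ~/.bashrc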

install pyaudio

pip install pyaudio
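
After `pip install pyaudio`, it can help to confirm that the module imports and that PortAudio sees a microphone. The snippet below is a minimal sketch under that assumption, not part of this repository; the helper name `list_input_devices` is hypothetical. It returns `None` when PyAudio cannot be imported or initialized, for example when the PortAudio library is not on the library path.

```python
def list_input_devices():
    """Return the names of audio input devices, or None if PyAudio is unusable."""
    try:
        import pyaudio  # fails if the package or the PortAudio library is missing
    except ImportError as error:
        print("PyAudio could not be imported:", error)
        return None

    try:
        audio = pyaudio.PyAudio()
    except OSError as error:
        print("PortAudio could not be initialized:", error)
        return None

    try:
        names = []
        for index in range(audio.get_device_count()):
            info = audio.get_device_info_by_index(index)
            # Devices with at least one input channel can be used for recording.
            if info.get("maxInputChannels", 0) > 0:
                names.append(info["name"])
        return names
    finally:
        # Always release PortAudio resources.
        audio.terminate()


devices = list_input_devices()
if devices is None:
    print("PyAudio is not available; check the PortAudio installation.")
elif not devices:
    print("PyAudio works, but no input device was found.")
else:
    print("Input devices:", ", ".join(devices))
```

An empty list means the install itself is fine and only a recording device is missing, which matters before trying ./record.py.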

Getting started

Toy examples: ./number_classifier_tflearn.py ./speaker_classifier_tflearn.py

Some less trivial architectures: ./densenet_layer.py

Later: ./train.sh ./record.py

Sample spectrogram or record.py

Update: Nervana demonstrated that it is possible for 'independents' to build speech recognizers that are state of the art.

Fun tasks for newcomers

Extensions

Extensions to current tensorflow which are probably needed:

Even though this project is far from finished, we hope it gives you some starting points.

Looking for a tensorflow collaboration / consultant / deep learning contractor? Reach out to [email protected]

More Repositories

1. tensorflow-ocr [Python, 647 stars]: 🖺 OCR using tensorflow with attention
2. caffe-speech-recognition [Jupyter Notebook, 325 stars]: Speech Recognition with the Caffe deep learning framework, migrating to
3. caffe-ocr [Shell, 215 stars]: OCR with caffe deep learning framework -> Migrated to tensorflow
4. english-script [Ruby, 164 stars]: 🖊 English as a programming language
5. angle [Python, 129 stars]: ⦠ Angle: new speakable syntax for python 💡
6. wasp [C++, 112 stars]: 🐝 Wasp : Wasm programming language
7. tensorpeers [Python, 64 stars]: p2p peer-to-peer training of tensorflow models
8. xipher [C, 62 stars]: 🔒 Simple perfect xor encryption cipher 🔒
9. jini-plugin [Java, 38 stars]: Copilot X like features for Jetbrains IDEs using ChatGpt and GPT-4
10. hieros [HTML, 23 stars]: Egyptian hieroglyps and Eurasian languages
11. Diffie-Hellman [Java, 21 stars]: Standalone Java reference implementation of Diffie Hellman
12. jeannie-webclient [JavaScript, 21 stars]: The famous Jeannie assistant now living inside of your browser, including emails, calls etc! #siri
13. Voice-Actions-API [Java, 10 stars]: Public API hook for Voice Actions Plus / Jeannie
14. swadesh [Roff, 8 stars]: Collection of swadesh lists in CSV table format with possible connections to Indo European
15. netbase [C++, 8 stars]: 🌐 Netbase : Semantic Graph Database & Wikidata Server
16. wasm [8 stars]: You are looking for webassembly
17. tensor-caffe [Protocol Buffer, 7 stars]: TensorFlow graph importer from caffe protobuf / prototxt
18. karpathy_neuralnets_python [Python, 6 stars]: Python code accompanying Andrej Karpathy's [great] Hacker's guide to Neural Networks
19. layer [Python, 6 stars]: tensorflow custom comfort wrapper
20. Voice2Web [JavaScript, 4 stars]
21. node-netbase [JavaScript, 4 stars]: node.js module for netbase: a semantic Graph Database with wordnet, wikidata, freebase, csv, xml, ... importer
22. kast [Python, 4 stars]: Canonical AST, the only Abstract Syntax Tree you need, with importers+exporters to all languages
23. shapenet [Python, 3 stars]: Deep Learning network learning geometric shapes
24. english-script-samples [E, 2 stars]: English script samples for the angle programming language
25. blueprints-netbase [Java, 2 stars]: blueprints driver for the netbase graph database
26. speech-data [2 stars]
27. angle.js [JavaScript, 2 stars]: javascript version of the angle programming language
28. concentration-camps [1 star]: Close Korean concentration camps
29. angle.ts [TypeScript, 1 star]: Angle programming language - Typescript implementation and bindings
30. lyri [Swift, 1 star]: Lyri: a Siri-like assistant for your Terminal command line (Linux and Mac OS/X)
31. test-lld-wasm [Shell, 1 star]: Minimal example of how to merge/concat/link/combine two wasm files.
32. netbase-ruby [Ruby, 1 star]: ruby adapter for netbase