• This repository has been archived on 14/Mar/2023
  • Language
    Python
  • Created over 10 years ago
  • Updated almost 8 years ago

Simple speech recognition using your microphone.

Scribe

Simple speech recognition for Python. Run the script, say some things into your microphone, and then see what you said (or an approximation).

Powered by pyaudio and Sphinx.

Installation

Sphinxbase

Download sphinxbase and extract the files.

Now, run:

cd sphinxbase
./configure && make clean all && make install
cd python
python setup.py install

You may need to use sudo for make install or python setup.py install.

Pocketsphinx

Download pocketsphinx and extract the files.

Now, run:

cd pocketsphinx
./configure && make clean all && make install
cd python
python setup.py install

Packages (Linux only)

Now, run:

cd speech-recognizer
sudo xargs -a apt-packages.txt apt-get install

Pyaudio

Now, download the version of pyaudio that matches your operating system and Python version, and install it.

Language files

If you want to speak English, you need to get the English language model and the English acoustic model.

You will need to put the acoustic model into scribe/hmm, and the language model into scribe/lm.

The file tree should look like this for English:

scribe
├── dict
│   └── cmu07a.dic
├── hmm
│   ├── feat.params
│   ├── feature_transform
│   ├── mdef
│   ├── means
│   ├── mixture_weights
│   ├── noisedict
│   ├── README
│   ├── transition_matrices
│   └── variances
├── lm
│   └── cmusphinx-5.0-en-us.lm.dmp

For other languages, check here, or see below on training your own model. If you use different language models, acoustic models, or dictionaries, you will want to change these paths in recognizer.py:

HMDIR = os.path.join(BASE_PATH, "hmm")
LMDIR = os.path.join(BASE_PATH, "lm/cmusphinx-5.0-en-us.lm.dmp")
DICTD = os.path.join(BASE_PATH, "dict/cmu07a.dic")
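
Before running the recognizer, it can help to confirm that these three paths actually exist. The following is a minimal sketch, assuming the English layout shown above. `missing_model_paths` is a hypothetical helper and is not part of recognizer.py; the base path passed to it is only an illustration.

```python
import os


def missing_model_paths(base_path):
    """Return the expected model paths that do not exist under base_path.

    Hypothetical helper: it mirrors the HMDIR / LMDIR / DICTD constants
    described above, but it is not part of recognizer.py.
    """
    expected = {
        "HMDIR": os.path.join(base_path, "hmm"),
        "LMDIR": os.path.join(base_path, "lm", "cmusphinx-5.0-en-us.lm.dmp"),
        "DICTD": os.path.join(base_path, "dict", "cmu07a.dic"),
    }
    # Keep only the entries whose path is absent on disk.
    return {name: path for name, path in expected.items()
            if not os.path.exists(path)}


if __name__ == "__main__":
    # Assumes the script is run from the directory that contains scribe/.
    for name, path in missing_model_paths("scribe").items():
        print("%s is missing: %s" % (name, path))
```

If you switch to another language model or dictionary, the file names in `expected` would need to change to match.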

Run

To run, you just have to:

cd speech-recognizer
python recognizer.py

You should be able to talk for a few seconds, after which it will spend some time processing, and then show you what you said.

Configure

There are some options that you can modify at the top of recognizer.py. The easiest one to modify is RECORD_SECONDS.
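
As a rough illustration of what RECORD_SECONDS controls, the sketch below shows how a recording length typically translates into the number of audio buffers read from the microphone. The names RATE and CHUNK and all three values are assumptions made for this example; the actual option names and defaults in recognizer.py may differ.

```python
RATE = 16000          # samples per second (assumed value)
CHUNK = 1024          # samples read per buffer (assumed value)
RECORD_SECONDS = 5    # how long to listen, in seconds (assumed value)


def buffers_to_read(rate, chunk, seconds):
    """Number of whole buffers of `chunk` samples that fit in `seconds` of audio."""
    return int(rate * seconds / chunk)


if __name__ == "__main__":
    # A recording loop would read this many buffers from the input stream.
    print(buffers_to_read(RATE, CHUNK, RECORD_SECONDS))
```

Under these assumptions, doubling RECORD_SECONDS roughly doubles both the amount of audio captured and the time spent processing it.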

More reading

To find out more, read up on Sphinx.

You can train the language models to make them more accurate, use unsupported languages, or be more domain-specific.
