VikParuchuri/scribe

This repository has been archived on 14/Mar/2023
Language
Python
Created over 10 years ago
Updated almost 8 years ago

Simple speech recognition using your microphone.

Scribe

Simple speech recognition for Python. Run the script, say some things into your microphone, and then see what you said (or an approximation).

Powered by pyaudio and Sphinx.

Installation

Sphinxbase

Download sphinxbase and extract the files.

Now, run:

cd sphinxbase
./configure && make clean all && make install
cd python
python setup.py install

You may need to use sudo for make install or python setup.py install.

Pocketsphinx

Download pocketsphinx and extract the files.

Now, run:

cd pocketsphinx
./configure && make clean all && make install
cd python
python setup.py install

Packages (Linux only)

Now, run:

cd speech-recognizer
sudo xargs -a apt-packages.txt apt-get install

Pyaudio

Now, download the version of pyaudio that matches your operating system and Python version, and install it.
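Once the three installs above are done, a quick import check can confirm that Python can see them. This is a minimal sketch, not part of scribe: the helper name missing_modules is hypothetical, and the module names pyaudio, sphinxbase, and pocketsphinx are assumed to be the import names the packages install under.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Assumed import names for the packages installed above.
    required = ["pyaudio", "sphinxbase", "pocketsphinx"]
    missing = missing_modules(required)
    if missing:
        print("Not installed for this Python:", ", ".join(missing))
    else:
        print("All speech recognition dependencies were found.")
```

If a module is reported missing after a successful install, it was probably installed for a different Python interpreter than the one running the check.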

Language files

If you want to speak English, you need to get the English language model and the English acoustic model.

You will need to put the acoustic model into scribe/hmm, and the language model into scribe/lm. The pronunciation dictionary goes in scribe/dict.

The file tree should look like this for English:

scribe
├── dict
│   └── cmu07a.dic
├── hmm
│   ├── feat.params
│   ├── feature_transform
│   ├── mdef
│   ├── means
│   ├── mixture_weights
│   ├── noisedict
│   ├── README
│   ├── transition_matrices
│   └── variances
├── lm
│   └── cmusphinx-5.0-en-us.lm.dmp

For other languages, check here, or see below on training your own model. If you use different language models, acoustic models, or dictionaries, you will want to change these paths in recognizer.py:

HMDIR = os.path.join(BASE_PATH, "hmm")
LMDIR = os.path.join(BASE_PATH, "lm/cmusphinx-5.0-en-us.lm.dmp")
DICTD = os.path.join(BASE_PATH, "dict/cmu07a.dic")
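Before running the recognizer, it can help to confirm that these three paths actually exist. The following is a minimal sketch under stated assumptions: the function name missing_model_paths and its parameters are hypothetical, and the default file names are simply the English ones shown above.

```python
import os


def missing_model_paths(base_path,
                        hmm_dir="hmm",
                        lm_file="lm/cmusphinx-5.0-en-us.lm.dmp",
                        dict_file="dict/cmu07a.dic"):
    """Return the names of the model settings whose path does not exist."""
    expected = {
        "HMDIR": os.path.join(base_path, hmm_dir),
        "LMDIR": os.path.join(base_path, lm_file),
        "DICTD": os.path.join(base_path, dict_file),
    }
    return sorted(name for name, path in expected.items()
                  if not os.path.exists(path))


if __name__ == "__main__":
    # Assumes this script sits next to the hmm, lm, and dict directories.
    base = os.path.dirname(os.path.abspath(__file__))
    for name in missing_model_paths(base):
        print("Missing model path for", name)
```

If you switch to another language, pass the new file names as arguments instead of editing the defaults.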

Run

To run, you just have to:

cd speech-recognizer
python recognizer.py

You should be able to talk for a few seconds, after which it will spend some time processing, and then show you what you said.

Configure

There are some options that you can modify at the top of recognizer.py. The easiest one to modify is RECORD_SECONDS.
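To get a feel for what changing RECORD_SECONDS does, the sketch below estimates how many audio buffers a recording loop would read for a given duration. It assumes the common pyaudio pattern of reading fixed-size buffers from the input stream; the function name and the sample rate and buffer size defaults are hypothetical and may not match the values in recognizer.py.

```python
import math


def chunks_for_duration(record_seconds, rate=16000, chunk=1024):
    """Number of fixed-size buffers needed to cover record_seconds of audio.

    rate is samples per second and chunk is samples per buffer; both
    defaults are illustrative assumptions.
    """
    if record_seconds < 0 or rate <= 0 or chunk <= 0:
        raise ValueError("record_seconds must be >= 0; rate and chunk must be > 0")
    # Round up so the last partial buffer is still read.
    return math.ceil(rate * record_seconds / chunk)


if __name__ == "__main__":
    for seconds in (3, 5, 10):
        print(seconds, "seconds ->", chunks_for_duration(seconds), "buffers")
```

A longer RECORD_SECONDS means more audio to record and more for Sphinx to process, so expect the processing pause to grow with it.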

More reading

To find out more, read up on Sphinx.

You can train the language models to make them more accurate, use unsupported languages, or be more domain-specific.
