💬 Reverse Engineering Google's Speech To Text API (v2)

Google Speech API v2:

NOTICE

Google has since launched its official Google Cloud Speech API. I strongly recommend looking over there.

Host:

https://www.google.com/speech-api/v2/recognize

Parameters

output: json (xml is not supported)

lang: any valid locale (en-us, nl-be, fr-fr, etc.)

key: Please get one from the Google Developers Console

Key is not optional.

app: optional

You can specify an optional query parameter called app, which returns some extra transcripts for some reason.

client: optional, seems to do nothing in particular
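
As a rough illustration, the parameters above can be assembled into a request URL like this. This is only a sketch: the helper name build_recognize_url and its defaults are made up for this example and are not part of the API, and "yourkey" is a placeholder.

```python
from urllib.parse import urlencode

# Endpoint taken from the "Host" section above.
ENDPOINT = "https://www.google.com/speech-api/v2/recognize"


def build_recognize_url(key, lang="en-us", app=None, client=None):
    """Build the recognize URL (hypothetical helper, not part of the API)."""
    if not key:
        raise ValueError("key is not optional")
    params = {"output": "json", "lang": lang, "key": key}
    # app and client are optional, so only add them when given.
    if app is not None:
        params["app"] = app
    if client is not None:
        params["client"] = client
    return ENDPOINT + "?" + urlencode(params)


print(build_recognize_url("yourkey"))
```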

Data:

FLAC

FLAC file; 44100Hz, 32-bit float, exported with Audacity. Check the audio folder in this repository for some hilarious examples.

Channels       : 2
Sample Rate    : 44100
Precision      : 32-bit
Sample Encoding: 32-bit Float

16-bit PCM

The following audio options are confirmed working for 16-bit PCM sample encoding:

Channels       : 1
Sample Rate    : 16000
Precision      : 16-bit
Sample Encoding: 16-bit Signed Integer PCM

One-line sox recording command:

rec --encoding signed-integer --bits 16 --channels 1 --rate 16000 test.wav
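
To double-check that a recording really has the 16-bit PCM parameters listed above, something like the following Python sketch could be used. It relies only on the standard library wave module; the function name matches_confirmed_pcm is made up for this example. Note that wave reports the sample width but not signedness, so this assumes the usual WAV convention that 16-bit PCM is signed.

```python
import wave

# Values from the "confirmed working" 16-bit PCM parameters above.
EXPECTED_CHANNELS = 1
EXPECTED_RATE = 16000
EXPECTED_SAMPLE_WIDTH = 2  # bytes per sample, i.e. 16-bit


def matches_confirmed_pcm(path):
    """Return True if the WAV file at path is mono, 16000Hz, 16-bit."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getnchannels() == EXPECTED_CHANNELS
            and wav.getframerate() == EXPECTED_RATE
            and wav.getsampwidth() == EXPECTED_SAMPLE_WIDTH
        )
```

For example, matches_confirmed_pcm("test.wav") should return True for a file recorded with the sox command above.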

Headers:

Content-Type:

Content-Type: audio/x-flac; rate=44100;

Set the rate to match the sample rate of the FLAC file (generally 44100Hz); other rates are supported as well.

Content-Type: audio/l16; rate=16000; is also supported, with a rate of either 44100Hz or 16000Hz, for files encoded as 16-bit signed-integer LPCM.

NOTE: Make sure the rate in your header matches the sample rate you used for your audio capture.
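
One way to keep the header and the audio in sync is to generate the Content-Type value from the encoding and sample rate. A minimal Python sketch follows; the helper name content_type_header and the short names "flac" and "l16" are invented for this example.

```python
def content_type_header(encoding, rate):
    """Return the Content-Type value for "flac" or "l16" audio (hypothetical helper)."""
    mime_types = {"flac": "audio/x-flac", "l16": "audio/l16"}
    if encoding not in mime_types:
        raise ValueError("unsupported encoding: " + repr(encoding))
    # The rate must match the sample rate of the captured audio.
    return "{}; rate={};".format(mime_types[encoding], int(rate))


print(content_type_header("flac", 44100))  # audio/x-flac; rate=44100;
print(content_type_header("l16", 16000))   # audio/l16; rate=16000;
```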

User-Agent:

Not required, but for spoofing purposes you can use one of Chrome’s User-Agent strings.

Response:

When Google is 100% confident in its transcription, it returns the following object:

{
   "result":[
      {
         "alternative":[
            {
               "transcript":"good morning Google how are you feeling today"
            }
         ],
         "final":true
      }
   ],
   "result_index":0
}

When it's doubtful, it adds a confidence field to the first alternative. It also seems to return multiple alternative transcripts for some reason.

{
  "result":[
    {
      "alternative":[
        {
          "transcript":"this is a test",
          "confidence":0.97321892
        },
        {
          "transcript":"this is a test for"
        }
      ],
      "final":true
    }
  ],
  "result_index":0
}
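
Both response shapes above can be handled by one small parser. Here is a Python sketch that collects every transcript together with its confidence (None when the field is missing). The function name parse_alternatives is made up for this example, and the loop that tolerates several JSON objects in one body is a defensive assumption, not something documented above.

```python
import json


def parse_alternatives(body):
    """Return (transcript, confidence) pairs from a response body.

    confidence is None when the field is absent.
    """
    decoder = json.JSONDecoder()
    pairs = []
    pos = 0
    while pos < len(body):
        # Skip whitespace between JSON objects, if there is more than one.
        if body[pos].isspace():
            pos += 1
            continue
        obj, pos = decoder.raw_decode(body, pos)
        for result in obj.get("result", []):
            for alt in result.get("alternative", []):
                pairs.append((alt.get("transcript"), alt.get("confidence")))
    return pairs


sample = (
    '{"result":[{"alternative":['
    '{"transcript":"this is a test","confidence":0.97321892},'
    '{"transcript":"this is a test for"}],"final":true}],"result_index":0}'
)
print(parse_alternatives(sample))
# [('this is a test', 0.97321892), ('this is a test for', None)]
```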

Example

Install sox

On OS X with Homebrew installed:

brew install sox

Record audio

rec --encoding signed-integer --bits 16 --channels 1 --rate 16000 test.wav

Send the request

curl -X POST \
--data-binary @'audio/hello (16bit PCM).wav' \
--header 'Content-Type: audio/l16; rate=16000;' \
'https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=yourkey'

Or for FLAC encoded audio:

curl -X POST \
--data-binary @audio/good-morning-google.flac \
--header 'Content-Type: audio/x-flac; rate=44100;' \
'https://www.google.com/speech-api/v2/recognize?output=json&lang=en-us&key=yourkey'
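
The same request can be made without curl. Below is a rough Python equivalent using only the standard library; the function names build_request and recognize are invented for this example, and nothing is sent until recognize is called.

```python
import urllib.request


def build_request(url, audio_bytes, content_type):
    """Build (but do not send) a POST request carrying raw audio bytes."""
    return urllib.request.Request(
        url,
        data=audio_bytes,
        headers={"Content-Type": content_type},
        method="POST",
    )


def recognize(url, audio_path, content_type):
    """Send the audio file and return the response body as text."""
    with open(audio_path, "rb") as f:
        request = build_request(url, f.read(), content_type)
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

Calling recognize with the URL from the curl example, the path audio/good-morning-google.flac and the header value 'audio/x-flac; rate=44100;' should correspond to the FLAC request above.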

Caveats

Here are a few caveats to be aware of, should you decide to use this API in a production environment (which I don't recommend).

  • The API only accepts up to ~10-15 seconds of audio.
  • With a self-generated Speech API key, you can only make 50 requests per day.
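
Given the length limit, longer recordings would have to be split before sending. The Python sketch below computes frame ranges for chunks of at most 10 seconds, taking the lower end of the approximate limit as a conservative assumption. The function name chunk_frame_ranges is made up for this example, and cutting at fixed boundaries may split words, so treat it as a starting point only. Each chunk still counts as one request against the daily quota.

```python
def chunk_frame_ranges(total_frames, rate, max_seconds=10):
    """Return (start, end) frame index pairs, each at most max_seconds long."""
    frames_per_chunk = int(rate * max_seconds)
    if frames_per_chunk < 1:
        raise ValueError("rate and max_seconds must give at least one frame")
    return [
        (start, min(start + frames_per_chunk, total_frames))
        for start in range(0, total_frames, frames_per_chunk)
    ]


# 25 seconds of 16000Hz audio split into chunks of at most 10 seconds.
print(chunk_frame_ranges(25 * 16000, 16000))
# [(0, 160000), (160000, 320000), (320000, 400000)]
```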
