Cheetah
Made in Vancouver, Canada by Picovoice
Cheetah is an on-device streaming speech-to-text engine. Cheetah is:
- Private: all voice processing runs locally.
- Accurate
- Compact and Computationally-Efficient
- Cross-Platform:
- Linux (x86_64), macOS (x86_64, arm64), and Windows (x86_64)
- Android and iOS
- Chrome, Safari, Firefox, and Edge
- Raspberry Pi (4, 3) and NVIDIA Jetson Nano
AccessKey
AccessKey is your authentication and authorization token for deploying Picovoice SDKs, including Cheetah. Anyone using Picovoice needs a valid AccessKey, and you must keep it secret. You need internet connectivity to validate your AccessKey with Picovoice license servers, even though voice recognition itself runs 100% offline.
AccessKey also verifies that your usage is within the limits of your account. Everyone who signs up for Picovoice Console receives Free Tier usage rights. If you wish to increase your limits, you can purchase a subscription plan.
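Because the AccessKey must stay secret, one common approach is to keep it out of source code and read it from the environment at startup. The sketch below is only an illustration: the variable name `PICOVOICE_ACCESS_KEY` and the helper `load_access_key` are assumptions made for this example and are not part of any Picovoice SDK.

```python
import os


def load_access_key(env=None, var_name="PICOVOICE_ACCESS_KEY"):
    """Return the AccessKey stored in an environment variable.

    `var_name` is a hypothetical name chosen for this sketch; Picovoice does
    not prescribe one. `env` defaults to the process environment and can be
    replaced by a plain dict in tests.
    """
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set {var_name} to your Picovoice AccessKey")
    return key
```

The returned string can then be passed wherever the examples in this document use `${ACCESS_KEY}`.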
Demos
Python Demos
Install the demo package:
pip3 install pvcheetahdemo
cheetah_demo_mic --access_key ${ACCESS_KEY}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
C Demos
If using SSH, clone the repository with:
git clone --recurse-submodules [email protected]:Picovoice/cheetah.git
If using HTTPS, clone the repository with:
git clone --recurse-submodules https://github.com/Picovoice/cheetah.git
Build the demo:
cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build
Run the demo:
./demo/c/build/cheetah_demo_mic -a ${ACCESS_KEY} -m ${MODEL_PATH} -l ${LIBRARY_PATH}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console, ${LIBRARY_PATH} with the path to the appropriate library under lib, and ${MODEL_PATH} with the path to the default model file (or your custom one).
iOS Demos
To run the demo, go to demo/ios/CheetahDemo and run:
pod install
Replace let accessKey = "${YOUR_ACCESS_KEY_HERE}" in the file ViewModel.swift with your AccessKey.
Then, using Xcode, open the generated CheetahDemo.xcworkspace and run the application.
Android Demo
Using Android Studio, open demo/android/CheetahDemo as an Android project and then run the application.
Replace "${YOUR_ACCESS_KEY_HERE}" in the file MainActivity.java with your AccessKey.
Flutter Demo
To run the Cheetah demo on Android or iOS with Flutter, you must have the Flutter SDK installed on your system. Once installed, you can run flutter doctor to determine any other missing requirements for your relevant platform. Once your environment has been set up, launch a simulator or connect an Android/iOS device.
Before launching the app, use the copy_assets.sh script to copy the Cheetah demo model file into the demo project. (NOTE: on Windows, Git Bash or another bash shell is required, or you will have to manually copy the model file into the project.)
Replace "${YOUR_ACCESS_KEY_HERE}" in the file main.dart with your AccessKey.
Run the following command from demo/flutter to build and deploy the demo to your device:
flutter run
Go Demo
The demo requires cgo, which on Windows may mean that you need to install a gcc compiler like MinGW to build it properly.
From demo/go, run the following command in the terminal to build and run the mic demo:
go run micdemo/cheetah_mic_demo.go -access_key "${ACCESS_KEY}"
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
For more information about Go demos go to demo/go.
React Native Demo
To run the React Native Cheetah demo app, you will first need to set up your React Native environment. For this, please refer to React Native's documentation. Once your environment has been set up, navigate to demo/react-native and run the following commands:
For Android:
yarn android-install # sets up environment
yarn android-run # builds and deploys to Android
For iOS:
yarn ios-install # sets up environment
yarn ios-run # builds and deploys to iOS
Node.js Demo
Install the demo package:
yarn global add @picovoice/cheetah-node-demo
With a working microphone connected to your device, run the following in the terminal:
cheetah-mic-demo --access_key ${ACCESS_KEY}
For more information about Node.js demos go to demo/nodejs.
Java Demos
The Cheetah Java demo is a command-line application that lets you choose between running Cheetah on an audio file or on real-time microphone input.
To try the real-time demo, make sure there is a working microphone connected to your device. Then invoke the following commands from the terminal:
cd demo/java
./gradlew build
cd build/libs
java -jar cheetah-mic-demo.jar -a ${ACCESS_KEY}
For more information about Java demos go to demo/java.
.NET Demo
The Cheetah .NET demo is a command-line application that lets you choose between running Cheetah on an audio file or on real-time microphone input.
Make sure there is a working microphone connected to your device. From demo/dotnet/CheetahDemo run the following in the terminal:
dotnet run -c MicDemo.Release -- --access_key ${ACCESS_KEY}
Replace ${ACCESS_KEY} with your Picovoice AccessKey.
For more information about .NET demos, go to demo/dotnet.
Rust Demo
The Cheetah Rust demo is a command-line application that lets you choose between running Cheetah on an audio file or on real-time microphone input.
Make sure there is a working microphone connected to your device. From demo/rust/micdemo run the following in the terminal:
cargo run --release -- --access_key ${ACCESS_KEY}
Replace ${ACCESS_KEY} with your Picovoice AccessKey.
For more information about Rust demos, go to demo/rust.
Web Demo
From demo/web run the following in the terminal:
yarn
yarn start
(or)
npm install
npm run start
Open http://localhost:5000 in your browser to try the demo.
SDKs
Python
Install the Python SDK:
pip3 install pvcheetah
Create an instance of the engine and transcribe audio in real-time:
import pvcheetah

handle = pvcheetah.create(access_key='${ACCESS_KEY}')

def get_next_audio_frame():
    # placeholder: return the next frame of audio samples
    pass

while True:
    partial_transcript, is_endpoint = handle.process(get_next_audio_frame())
    if is_endpoint:
        final_transcript = handle.flush()
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
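The snippet above leaves get_next_audio_frame as a placeholder and discards the transcripts. As a rough sketch of how the pieces could fit together, the helpers below split a buffer of samples into fixed-length frames and accumulate partial and flushed text into one string. Both `split_into_frames` and `transcribe` are hypothetical names introduced for this example; they rely only on the `process(frame)` and `flush()` calls shown above and on the assumption that each returns text as in that snippet.

```python
def split_into_frames(samples, frame_length):
    """Yield consecutive frames of exactly `frame_length` samples.

    Trailing samples that do not fill a whole frame are dropped.
    """
    for start in range(0, len(samples) - frame_length + 1, frame_length):
        yield samples[start:start + frame_length]


def transcribe(engine, frames):
    """Feed frames to `engine` and return the accumulated transcript.

    `engine` is any object exposing `process(frame)` and `flush()` in the
    same shape as the snippet above (an assumption of this sketch).
    """
    pieces = []
    for frame in frames:
        partial_transcript, is_endpoint = engine.process(frame)
        pieces.append(partial_transcript)
        if is_endpoint:
            # An endpoint was detected: collect the text still buffered
            pieces.append(engine.flush())
    # Flush once more so text buffered at the end of the stream is kept
    pieces.append(engine.flush())
    return "".join(pieces)
```

With a real engine this might be called as `transcribe(handle, split_into_frames(pcm, handle.frame_length))`, assuming `pcm` holds 16-bit single-channel samples at the engine's sample rate.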
C
Create an instance of the engine and transcribe audio in real-time:
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#include "pv_cheetah.h"

pv_cheetah_t *handle = NULL;
const pv_status_t status = pv_cheetah_init("${ACCESS_KEY}", "${MODEL_PATH}", 0.f, false, &handle);
if (status != PV_STATUS_SUCCESS) {
    // error handling logic
}

extern const int16_t *get_next_audio_frame(void);

while (true) {
    char *partial_transcript = NULL;
    bool is_endpoint = false;
    const pv_status_t status = pv_cheetah_process(
            handle,
            get_next_audio_frame(),
            &partial_transcript,
            &is_endpoint);
    if (status != PV_STATUS_SUCCESS) {
        // error handling logic
    }
    // do something with transcript
    free(partial_transcript);
    if (is_endpoint) {
        char *final_transcript = NULL;
        const pv_status_t status = pv_cheetah_flush(handle, &final_transcript);
        if (status != PV_STATUS_SUCCESS) {
            // error handling logic
        }
        // do something with transcript
        free(final_transcript);
    }
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${MODEL_PATH} with the path to the default model file (or your custom one). Finally, when done, be sure to release the acquired resources using pv_cheetah_delete(handle).
iOS
The Cheetah iOS binding is available via CocoaPods. To import it into your iOS project, add the following line to your Podfile and run pod install:
pod 'Cheetah-iOS'
Create an instance of the engine and transcribe audio in real-time:
import Cheetah

let modelPath = Bundle(for: type(of: self)).path(
    forResource: "${MODEL_FILE}", // Name of the Cheetah model file
    ofType: "pv")!

let cheetah = try Cheetah(accessKey: "${ACCESS_KEY}", modelPath: modelPath)

func getNextAudioFrame() -> [Int16] {
    // .. get audioFrame
    return audioFrame
}

while true {
    do {
        let (partialTranscript, isEndpoint) = try cheetah.process(getNextAudioFrame())
        if isEndpoint {
            let finalTranscript = try cheetah.flush()
        }
    } catch let error as CheetahError {
        // handle error
    } catch { }
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${MODEL_FILE} with a custom trained model from Picovoice Console or the default model.
Android
To include the package in your Android project, ensure you have included mavenCentral() in your top-level build.gradle file and then add the following to your app's build.gradle:
dependencies {
    implementation 'ai.picovoice:cheetah-android:${LATEST_VERSION}'
}
Create an instance of the engine and transcribe audio in real-time:
import ai.picovoice.cheetah.*;

final String accessKey = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
final String modelPath = "${MODEL_FILE}";

short[] getNextAudioFrame() {
    // .. get audioFrame
    return audioFrame;
}

try {
    Cheetah cheetah = new Cheetah.Builder().setAccessKey(accessKey).setModelPath(modelPath).build(appContext);
    String transcript = "";
    while (true) {
        CheetahTranscript transcriptObj = cheetah.process(getNextAudioFrame());
        transcript += transcriptObj.getTranscript();
        if (transcriptObj.getIsEndpoint()) {
            CheetahTranscript finalTranscriptObj = cheetah.flush();
            transcript += finalTranscriptObj.getTranscript();
        }
    }
} catch (CheetahException ex) { }
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${MODEL_FILE} with the default or custom trained model from Picovoice Console.
Flutter
Add the Cheetah Flutter plugin to your pubspec.yaml:
dependencies:
  cheetah_flutter: ^<version>
Create an instance of the engine and transcribe audio in real-time:
import 'package:cheetah_flutter/cheetah.dart';

const accessKey = "{ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)

List<int> buffer = getAudioFrame();

try {
    Cheetah _cheetah = await Cheetah.create(accessKey, '{CHEETAH_MODEL_PATH}');
    String transcript = "";
    while (true) {
        CheetahTranscript partialResult = await _cheetah.process(getAudioFrame());
        transcript += partialResult.transcript;
        if (partialResult.isEndpoint) {
            CheetahTranscript finalResult = await _cheetah.flush();
            transcript += finalResult.transcript;
        }
    }
    _cheetah.delete();
} on CheetahException catch (err) { }
Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console and ${CHEETAH_MODEL_PATH} with the path to a custom trained model from Picovoice Console or the default model.
Go
Install the Go binding:
go get github.com/Picovoice/cheetah/binding/go
Create an instance of the engine and transcribe audio in real-time:
import . "github.com/Picovoice/cheetah/binding/go"

cheetah := NewCheetah("${ACCESS_KEY}")
err := cheetah.Init()
if err != nil {
    // handle err init
}
defer cheetah.Delete()

func getNextFrameAudio() []int16 {
    // get audio frame
}

for {
    partialTranscript, isEndpoint, err := cheetah.Process(getNextFrameAudio())
    if isEndpoint {
        finalTranscript, err := cheetah.Flush()
    }
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console. When done, be sure to explicitly release the resources using cheetah.Delete().
React Native
The Cheetah React Native binding is available via NPM. Add it via the following command:
yarn add @picovoice/cheetah-react-native
Create an instance of the engine and transcribe audio in real-time:
import {Cheetah, CheetahErrors} from '@picovoice/cheetah-react-native';

const getAudioFrame = () => {
    // get audio frames
}

try {
    // create the engine once, outside the processing loop
    const cheetah = await Cheetah.create("${ACCESS_KEY}", "${MODEL_FILE}")
    while (true) {
        const {transcript, isEndpoint} = await cheetah.process(getAudioFrame())
        if (isEndpoint) {
            const {transcript} = await cheetah.flush()
        }
    }
} catch (err: any) {
    if (err instanceof CheetahErrors) {
        // handle error
    }
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console and ${MODEL_FILE} with the default or custom trained model from Picovoice Console. When done, be sure to explicitly release the resources using cheetah.delete().
Node.js
Install the Node.js SDK:
yarn add @picovoice/cheetah-node
Create instances of the Cheetah class:
const Cheetah = require("@picovoice/cheetah-node");

const accessKey = "${ACCESS_KEY}"; // Obtained from the Picovoice Console (https://console.picovoice.ai/)
const endpointDurationSec = 0.2;
const handle = new Cheetah(accessKey);

function getNextAudioFrame() {
    // ...
    return audioFrame;
}

while (true) {
    const audioFrame = getNextAudioFrame();
    const [partialTranscript, isEndpoint] = handle.process(audioFrame);
    if (isEndpoint) {
        const finalTranscript = handle.flush();
    }
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
When done, be sure to release resources using release():
handle.release();
Java
Create an instance of the engine with the Cheetah Builder class and transcribe audio in real-time:
import ai.picovoice.cheetah.*;

final String accessKey = "..."; // AccessKey provided by Picovoice Console (https://console.picovoice.ai/)

short[] getNextAudioFrame() {
    // .. get audioFrame
    return audioFrame;
}

String transcript = "";

try {
    Cheetah cheetah = new Cheetah.Builder().setAccessKey(accessKey).build();
    while (true) {
        CheetahTranscript transcriptObj = cheetah.process(getNextAudioFrame());
        transcript += transcriptObj.getTranscript();
        if (transcriptObj.getIsEndpoint()) {
            CheetahTranscript finalTranscriptObj = cheetah.flush();
            transcript += finalTranscriptObj.getTranscript();
        }
    }
    cheetah.delete();
} catch (CheetahException ex) { }
.NET
Install the .NET SDK using NuGet or the dotnet CLI:
dotnet add package Cheetah
The SDK exposes a factory method to create instances of the engine as below:
using Pv;
const string accessKey = "${ACCESS_KEY}";
Cheetah handle = Cheetah.Create(accessKey);
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
When initialized, the valid sample rate is given by handle.SampleRate. The expected frame length (number of audio samples in an input array) is handle.FrameLength. The engine accepts 16-bit linearly-encoded PCM and operates on single-channel audio.
short[] GetNextAudioFrame()
{
    // .. get audioFrame
    return audioFrame;
}

string transcript = "";

while (true)
{
    CheetahTranscript transcriptObj = handle.Process(GetNextAudioFrame());
    transcript += transcriptObj.Transcript;
    if (transcriptObj.IsEndpoint)
    {
        CheetahTranscript finalTranscriptObj = handle.Flush();
        transcript += finalTranscriptObj.Transcript;
    }
}
Cheetah will have its resources freed by the garbage collector, but to have resources freed immediately after use, wrap it in a using statement:
using (Cheetah handle = Cheetah.Create(accessKey))
{
    // .. Cheetah usage here
}
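The input format stated above (16-bit linearly-encoded PCM, single channel) often differs from what a capture library delivers. The sketch below shows one way to downmix interleaved floating-point samples to mono and quantize them to 16-bit integers. It is written in Python to match the first SDK example in this document; `to_mono_int16` is a hypothetical helper, and it assumes input samples lie in the range [-1.0, 1.0].

```python
def to_mono_int16(interleaved, channels):
    """Downmix interleaved float samples in [-1.0, 1.0] to mono 16-bit PCM."""
    if channels < 1 or len(interleaved) % channels != 0:
        raise ValueError("sample count must be a multiple of the channel count")
    mono = []
    for i in range(0, len(interleaved), channels):
        # Average the channels belonging to one sample period
        value = sum(interleaved[i:i + channels]) / channels
        # Clamp out-of-range input, then scale to the signed 16-bit range
        value = max(-1.0, min(1.0, value))
        mono.append(int(round(value * 32767)))
    return mono
```

Resampling to the engine's sample rate is a separate step that this sketch does not cover.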
Rust
First you will need Rust and Cargo installed on your system.
To add the Cheetah library to your app, add pv_cheetah to your app's Cargo.toml manifest:
[dependencies]
pv_cheetah = "*"
Create an instance of the engine using a CheetahBuilder instance and transcribe a frame of audio:
use cheetah::{Cheetah, CheetahBuilder};

fn next_audio_frame() -> Vec<i16> {
    // get audio frame
}

let access_key = "${ACCESS_KEY}"; // AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)
let cheetah: Cheetah = CheetahBuilder::new().access_key(access_key).init().expect("Unable to create Cheetah");

if let Ok(cheetah_transcript) = cheetah.process(&next_audio_frame()) {
    println!("{}", cheetah_transcript.transcript);
    if cheetah_transcript.is_endpoint {
        if let Ok(cheetah_transcript) = cheetah.flush() {
            println!("{}", cheetah_transcript.transcript);
        }
    }
}
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console.
Web
Install the web SDK using yarn:
yarn add @picovoice/cheetah-web
or using npm:
npm install --save @picovoice/cheetah-web
Create an instance of the engine using CheetahWorker and transcribe audio:
import { CheetahWorker } from "@picovoice/cheetah-web";
import cheetahParams from "${PATH_TO_BASE64_CHEETAH_PARAMS}";

let transcript = "";

function transcriptCallback(cheetahTranscript: CheetahTranscript) {
    transcript += cheetahTranscript.transcript;
    if (cheetahTranscript.isEndpoint) {
        transcript += "\n";
    }
}

function getAudioData(): Int16Array {
    // ... function to get audio data
    return new Int16Array();
}

const cheetah = await CheetahWorker.create(
    "${ACCESS_KEY}",
    transcriptCallback,
    { base64: cheetahParams }
);

for (;;) {
    cheetah.process(getAudioData());
    // break on some condition
}
cheetah.flush(); // runs transcriptCallback on remaining data.
Replace ${ACCESS_KEY} with yours obtained from Picovoice Console. Finally, when done, release the resources using cheetah.release().
Releases
v1.1.0 - August 11th, 2022
- Added true-casing by default for transcription results.
- Added option to enable automatic punctuation insertion.
- Cheetah Web SDK release.
v1.0.0 - January 25th, 2022
- Initial release.