  • Language: Python
  • License: Apache License 2.0
  • Created about 3 years ago
  • Updated 2 months ago

Repository Details

On-device voice activity detection (VAD) powered by deep learning

Cobra

Made in Vancouver, Canada by Picovoice

Cobra is a highly accurate and lightweight voice activity detection (VAD) engine.

Demos

Python Demos

Install the demo package:

sudo pip3 install pvcobrademo

With a working microphone connected to your device, run the following in the terminal:

cobra_demo_mic --access_key ${AccessKey}

Replace ${AccessKey} with your AccessKey obtained from Picovoice Console. Cobra starts processing the audio input from the microphone in real time and prints to the terminal whenever it detects voice activity.

For more information about the Python demos go to demo/python.
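As a rough illustration of how per-frame voice probabilities can be turned into "voice detected" output, the sketch below emits start and stop events. It is not the demo's actual implementation: the function name voice_events, the 0.5 threshold, and the three-frame hangover are all assumptions chosen for the example.

```python
from typing import Iterable, Iterator, Tuple


def voice_events(
    probabilities: Iterable[float],
    threshold: float = 0.5,      # illustrative value, not a Cobra default
    hangover_frames: int = 3,    # illustrative value, not a Cobra default
) -> Iterator[Tuple[str, int]]:
    """Yield ("start", i) and ("stop", i) events from per-frame probabilities.

    Speech starts at the first frame at or above `threshold` and stops after
    `hangover_frames` consecutive frames below it.
    """
    speaking = False
    quiet = 0  # consecutive below-threshold frames seen while speaking
    for index, probability in enumerate(probabilities):
        if probability >= threshold:
            quiet = 0
            if not speaking:
                speaking = True
                yield ("start", index)
        elif speaking:
            quiet += 1
            if quiet >= hangover_frames:
                speaking = False
                yield ("stop", index)


if __name__ == "__main__":
    sample = [0.1, 0.9, 0.8, 0.2, 0.9, 0.1, 0.1, 0.1, 0.0]
    for event, frame_index in voice_events(sample):
        print(event, frame_index)
```

The hangover keeps a single low-probability frame in the middle of an utterance from splitting it into two segments; a suitable value depends on the application.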

C Demos

Build the demo:

cmake -S demo/c/ -B demo/c/build && cmake --build demo/c/build --target cobra_demo_mic

To list the available audio input devices:

./demo/c/build/cobra_demo_mic -s

To run the demo:

./demo/c/build/cobra_demo_mic -l ${LIBRARY_PATH} -a ${ACCESS_KEY} -d ${AUDIO_DEVICE_INDEX}

Replace ${LIBRARY_PATH} with the path to the appropriate library available under lib, ${ACCESS_KEY} with the AccessKey obtained from Picovoice Console, and ${AUDIO_DEVICE_INDEX} with the index of your microphone device.

For more information about C demos go to demo/c.

Android Demos

Using Android Studio, open demo/android/Activity as an Android project. Replace the placeholder in String ACCESS_KEY = "..." inside MainActivity.java with your AccessKey generated by Picovoice Console, then run the application.

For more information about Android demos go to demo/android.

iOS Demos

Run the following from the iOS demo directory to install the Cobra-iOS CocoaPod:

pod install

Replace the placeholder in let ACCESS_KEY = "..." inside ViewModel.swift with your AccessKey obtained from Picovoice Console.

Then, using Xcode, open the generated CobraDemo.xcworkspace and run the application. Press the start button and start talking. The background will change colour while you're talking.

For more information about iOS demos go to demo/ios.

Web Demos

From demo/web run the following in the terminal:

yarn
yarn start

or

npm install
npm run start

Open http://localhost:5000 in your browser to try the demo.

Rust Demos

From demo/rust/micdemo build and run the demo:

cargo run --release -- --access_key ${ACCESS_KEY}

For more information about Rust demos go to demo/rust.

SDKs

Python

Install the Python SDK:

pip3 install pvcobra

The SDK exposes a factory method to create instances of the engine:

import pvcobra

handle = pvcobra.create(access_key="${AccessKey}")

where ${AccessKey} is an AccessKey obtained from Picovoice Console. Once initialized, the required sample rate is available as handle.sample_rate, and the required frame length (number of audio samples in an input array) as handle.frame_length. The object can be used to monitor incoming audio as follows:

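def get_next_audio_frame():
    # Return the next handle.frame_length samples of single-channel, 16-bit audio.
    pass

while True:
    # voice_probability is a floating-point number within [0, 1].
    voice_probability = handle.process(get_next_audio_frame())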

Finally, when done be sure to explicitly release the resources using handle.delete().
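The loop above assumes audio already arrives in frames of exactly handle.frame_length samples. When the audio is instead available as one long buffer, it has to be split first. The following is a minimal sketch of that step; the helper names split_into_frames and voiced_flags and the 0.5 threshold are assumptions for illustration and are not part of the pvcobra API.

```python
from typing import Callable, List, Sequence


def split_into_frames(pcm: Sequence[int], frame_length: int) -> List[Sequence[int]]:
    """Split samples into consecutive frames of exactly `frame_length` samples.

    A trailing partial frame is dropped, since the engine expects full frames.
    """
    if frame_length <= 0:
        raise ValueError("frame_length must be positive")
    usable = len(pcm) - len(pcm) % frame_length
    return [pcm[i:i + frame_length] for i in range(0, usable, frame_length)]


def voiced_flags(
    pcm: Sequence[int],
    frame_length: int,
    process: Callable[[Sequence[int]], float],
    threshold: float = 0.5,  # illustrative value, not a Cobra default
) -> List[bool]:
    """Run `process` on every full frame and threshold the returned probability."""
    return [process(frame) >= threshold for frame in split_into_frames(pcm, frame_length)]


# Hypothetical usage with the real engine (requires pvcobra and an AccessKey):
#
#     handle = pvcobra.create(access_key="...")
#     try:
#         flags = voiced_flags(pcm, handle.frame_length, handle.process)
#     finally:
#         handle.delete()  # release native resources even if processing fails
```

Passing the processing function as a parameter keeps the helpers testable without the engine. The try/finally in the usage comment is one way to make sure handle.delete() always runs.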

C

The include/pv_cobra.h header file contains the relevant information. Build an instance of the object:

    pv_cobra_t *handle = NULL;
    pv_status_t status = pv_cobra_init("${ACCESS_KEY}", &handle);
    if (status != PV_STATUS_SUCCESS) {
        // error handling logic
    }

Replace ${ACCESS_KEY} with the AccessKey obtained from Picovoice Console. The handle can now be used to monitor an incoming audio stream. Cobra accepts single-channel, 16-bit linearly-encoded PCM audio. The sample rate can be retrieved using pv_sample_rate(). Cobra accepts input audio in consecutive chunks (frames); the length of each frame can be retrieved using pv_cobra_frame_length().

extern const int16_t *get_next_audio_frame(void);

while (true) {
    const int16_t *pcm = get_next_audio_frame();
    float is_voiced = 0.f;
    const pv_status_t status = pv_cobra_process(handle, pcm, &is_voiced);
    if (status != PV_STATUS_SUCCESS) {
        // error handling logic
    }
}

Finally, when done be sure to release the acquired resources:

pv_cobra_delete(handle);
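The audio format described above (single-channel, 16-bit linear PCM, delivered in fixed-length frames) can be checked before any audio reaches the engine. The sketch below, written in Python, reads a WAV file with the standard library and yields frames in that format. The function name read_pcm_frames is hypothetical, and frame_length and sample_rate are assumed to be the values reported by the engine.

```python
import struct
import wave
from typing import BinaryIO, Iterator, List, Union


def read_pcm_frames(
    source: Union[str, BinaryIO],
    frame_length: int,
    sample_rate: int,
) -> Iterator[List[int]]:
    """Yield full frames of 16-bit samples from a single-channel WAV file.

    Raises ValueError if the file does not match the expected format.
    """
    with wave.open(source, "rb") as wav:
        if wav.getnchannels() != 1:
            raise ValueError("audio must be single-channel")
        if wav.getsampwidth() != 2:
            raise ValueError("audio must be 16-bit PCM")
        if wav.getframerate() != sample_rate:
            raise ValueError("unexpected sample rate")
        while True:
            raw = wav.readframes(frame_length)
            if len(raw) < frame_length * 2:
                break  # drop the trailing partial frame
            # WAV stores samples as little-endian signed 16-bit integers.
            yield list(struct.unpack("<%dh" % frame_length, raw))
```

Rejecting mismatched files up front is a design choice for this example; an application could instead downmix or resample the audio before framing it.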

Android

Create an instance of the engine:

import ai.picovoice.cobra.Cobra;
import ai.picovoice.cobra.CobraException;

String accessKey = "${ACCESS_KEY}"; // AccessKey provided by Picovoice Console (https://console.picovoice.ai/)
try {
    handle = new Cobra(accessKey);
} catch (CobraException e) {
    // handle error
}

Once initialized, the required sample rate can be obtained using handle.getSampleRate(), and the required frame length (number of audio samples in an input array) using handle.getFrameLength(). The object can be used to monitor incoming audio as follows:

short[] getNextAudioFrame() {
    // .. get audioFrame
    return audioFrame;
}

while (true) {
    try {
        final float voiceProbability = handle.process(getNextAudioFrame());
    } catch (CobraException e) {
        // handle error
    }
}

Finally, when done be sure to explicitly release the resources using handle.delete().

iOS

To import the Cobra iOS binding into your project, add the following line to your Podfile and run pod install:

pod 'Cobra-iOS'

Create an instance of the engine:

import Cobra

let accessKey: String = "${ACCESS_KEY}" // AccessKey provided by Picovoice Console (https://console.picovoice.ai/)
do {
    handle = try Cobra(accessKey: accessKey)
} catch { }

func getNextAudioFrame() -> [Int16] {
    // .. get audioFrame
    return audioFrame
}

while true {
    do {
        let voiceProbability = try handle.process(getNextAudioFrame())
    } catch { }
}

Finally, when done be sure to explicitly release the resources using handle.delete().

Web

Install the web SDK using yarn:

yarn add @picovoice/cobra-web

or using npm:

npm install --save @picovoice/cobra-web

Create an instance of the engine using CobraWorker and run the VAD on an audio input stream:

import { CobraWorker } from "@picovoice/cobra-web";

function voiceProbabilityCallback(voiceProbability: number) {
  // use the voice probability figure
}

function getAudioData(): Int16Array {
  // function to get audio data
  return new Int16Array();
}

const cobra = await CobraWorker.create(
  "${ACCESS_KEY}",
  voiceProbabilityCallback
);

for (;;) {
  cobra.process(getAudioData());
  // break on some condition
}

Replace ${ACCESS_KEY} with your AccessKey obtained from Picovoice Console.

When done, release the resources allocated to Cobra using cobra.release().
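A callback such as voiceProbabilityCallback receives one probability per frame, and those values can fluctuate from frame to frame. One common way to use them is to smooth over a short window before acting. The sketch below shows the idea in Python; make_smoothing_callback and the window size of 5 are assumptions for illustration and are not part of any Cobra SDK.

```python
from collections import deque
from typing import Callable, Deque


def make_smoothing_callback(
    on_smoothed: Callable[[float], None],
    window: int = 5,  # illustrative window size, not a Cobra default
) -> Callable[[float], None]:
    """Return a callback that forwards the moving average of recent probabilities."""
    recent: Deque[float] = deque(maxlen=window)

    def callback(voice_probability: float) -> None:
        # Keep only the last `window` values and report their mean.
        recent.append(voice_probability)
        on_smoothed(sum(recent) / len(recent))

    return callback
```

A larger window gives steadier output at the cost of reacting more slowly when speech starts or stops.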

Rust

Create an instance of the engine and detect voice activity:

use cobra::Cobra;

let cobra = Cobra::new("${ACCESS_KEY}").expect("failed to create Cobra");

fn next_audio_frame() -> Vec<i16> {
    // get audio frame
    unimplemented!()
}

loop {
    if let Ok(voice_probability) = cobra.process(&next_audio_frame()) {
      // ...
    }
}

Releases

v1.2.0 January 27th, 2023

  • Updated Cobra engine for improved accuracy and performance
  • iOS minimum requirement moved to iOS 11.0
  • Minor bug fixes

v1.1.0 January 21st, 2022

  • Improved types for web binding
  • Various bug fixes and improvements

v1.0.0 October 8th, 2021

  • Initial release.
