  • Stars: 637
  • Rank: 70,628 (Top 2 %)
  • Language: Python
  • License: MIT License
  • Created 12 months ago
  • Updated 3 months ago

Wyoming Satellite

Remote voice satellite using the Wyoming protocol.

See the tutorial to build a satellite using a Raspberry Pi Zero 2 W and a ReSpeaker 2Mic HAT.

Requires:

  • Python 3.7+ (tested on 3.9+)
  • A microphone

Installation

Install the necessary system dependencies:

sudo apt-get install python3-venv python3-pip

Then run the install script:

script/setup

The examples below use alsa-utils to record and play audio:

sudo apt-get install alsa-utils

Remote Wake Word Detection

Run the satellite with remote wake word detection:

cd wyoming-satellite/
script/run \
  --name 'my satellite' \
  --uri 'tcp://0.0.0.0:10700' \
  --mic-command 'arecord -r 16000 -c 1 -f S16_LE -t raw' \
  --snd-command 'aplay -r 22050 -c 1 -f S16_LE -t raw'

This will use the default microphone and playback devices.

Use arecord -D <DEVICE> ... if you need to use a different microphone (list them with arecord -L and prefer plughw: devices). Use aplay -D <DEVICE> ... if you need to use a different playback device (list them with aplay -L and prefer plughw: devices).
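As a minimal sketch of how the -D option fits into the two command strings, the helper below assembles the arecord and aplay commands from the examples above with an optional device. The function name alsa_command and the device name are hypothetical; substitute a device printed by arecord -L or aplay -L.

```python
import shlex


def alsa_command(tool, rate, device=None):
    """Build an arecord/aplay command line for raw 16-bit mono audio."""
    args = [tool]
    if device is not None:
        # -D selects a specific ALSA device instead of the default one.
        args += ["-D", device]
    args += ["-r", str(rate), "-c", "1", "-f", "S16_LE", "-t", "raw"]
    # Quote each argument so the result is safe to paste into a shell.
    return " ".join(shlex.quote(arg) for arg in args)


# Hypothetical device name, used only for illustration.
DEVICE = "plughw:CARD=Device,DEV=0"

mic_command = alsa_command("arecord", 16000, DEVICE)
snd_command = alsa_command("aplay", 22050, DEVICE)
print(mic_command)
print(snd_command)
```

The printed strings are what would be passed to --mic-command and --snd-command.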

Add --debug to print additional logs.

In the Home Assistant settings "Devices & services" page, you should see the satellite discovered automatically. If not, click "Add Integration", choose "Wyoming Protocol", and enter the IP address of the satellite (port 10700).

Audio will be continuously streamed to the server, where wake word detection, etc. will occur.
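To get a feel for what continuous streaming costs, the arithmetic below estimates the raw audio data rate implied by the arecord settings above. It assumes the microphone audio is forwarded unchanged and ignores any protocol overhead, so treat the numbers as a rough lower bound rather than a measurement.

```python
SAMPLE_RATE_HZ = 16000  # arecord -r 16000
CHANNELS = 1            # arecord -c 1
BYTES_PER_SAMPLE = 2    # S16_LE is signed 16-bit little-endian

bytes_per_second = SAMPLE_RATE_HZ * CHANNELS * BYTES_PER_SAMPLE
kilobits_per_second = bytes_per_second * 8 / 1000
megabytes_per_hour = bytes_per_second * 3600 / 1_000_000

print(bytes_per_second)     # 32000
print(kilobits_per_second)  # 256.0
print(megabytes_per_hour)   # 115.2
```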

Voice Activity Detection

Rather than always streaming audio to Home Assistant, the satellite can wait until speech is detected.

NOTE: This will not work on the 32-bit version of Raspberry Pi OS.

Install the dependencies for silero VAD:

.venv/bin/pip3 install 'pysilero-vad==1.0.0'

Run the satellite with VAD enabled:

script/run \
  ... \
  --vad

Now, audio will only start streaming once speech has been detected.
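The gating idea can be illustrated with a short sketch. Note that this is not silero VAD and not the satellite's code: it swaps in a simple energy threshold purely to show how audio chunks are held back until something resembling speech appears. The function names and the threshold value are assumptions.

```python
import math
import struct


def rms(chunk):
    """Root-mean-square amplitude of 16-bit little-endian PCM samples."""
    count = len(chunk) // 2
    if count == 0:
        return 0.0
    samples = struct.unpack("<%dh" % count, chunk[: count * 2])
    return math.sqrt(sum(s * s for s in samples) / count)


def gate_until_speech(chunks, threshold=500.0):
    """Yield chunks starting with the first one at or above the threshold."""
    speaking = False
    for chunk in chunks:
        if not speaking and rms(chunk) >= threshold:
            speaking = True  # From here on, everything is streamed.
        if speaking:
            yield chunk


silence = struct.pack("<4h", 0, 0, 0, 0)
speech = struct.pack("<4h", 1000, -1000, 1000, -1000)

streamed = list(gate_until_speech([silence, silence, speech, silence]))
print(len(streamed))  # 2
```

The two leading silent chunks are dropped; the loud chunk and everything after it pass through.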

Local Wake Word Detection

Install a wake word detection service, such as wyoming-openwakeword, and start it:

cd wyoming-openwakeword/
script/run \
  --uri 'tcp://0.0.0.0:10400' \
  --preload-model 'ok_nabu'

Add --debug to print additional logs. See --help for more information.

Included wake words are:

  • ok_nabu
  • hey_jarvis
  • alexa
  • hey_mycroft
  • hey_rhasspy

Community-trained wake words are also available and can be included with --custom-model-dir <DIR>, where <DIR> contains .tflite file(s).

Next, start the satellite with some additional arguments:

cd wyoming-satellite/
script/run \
  --name 'my satellite' \
  --uri 'tcp://0.0.0.0:10700' \
  --mic-command 'arecord -r 16000 -c 1 -f S16_LE -t raw' \
  --snd-command 'aplay -r 22050 -c 1 -f S16_LE -t raw' \
  --wake-uri 'tcp://127.0.0.1:10400' \
  --wake-word-name 'ok_nabu'

Audio will only be streamed to the server after the wake word has been detected.

Once a wake word has been detected, it cannot be detected again for several seconds (called the "refractory period"). You can change this with --wake-refractory-seconds <SECONDS>.
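The refractory period behaves like a simple time-based filter. The sketch below is one way to express that logic and is not taken from the satellite's source; the class name RefractoryFilter, the 5-second value, and the injected clock are all assumptions made for illustration.

```python
import time


class RefractoryFilter:
    """Ignore detections that arrive too soon after an accepted one."""

    def __init__(self, refractory_seconds, clock=time.monotonic):
        self.refractory_seconds = refractory_seconds
        self._clock = clock
        self._last_detection = None

    def accept(self):
        """Return True if a detection happening now should be reported."""
        now = self._clock()
        if (
            self._last_detection is not None
            and now - self._last_detection < self.refractory_seconds
        ):
            return False  # Still inside the refractory period.
        self._last_detection = now
        return True


# Fake clock so the example is deterministic: detections at 0 s, 2 s, 6 s.
fake_times = iter([0.0, 2.0, 6.0])
wake_filter = RefractoryFilter(5.0, clock=lambda: next(fake_times))

results = [wake_filter.accept() for _ in range(3)]
print(results)  # [True, False, True]
```

The detection at 2 s is dropped because it falls within 5 s of the first one; the detection at 6 s is accepted.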

Note that --vad is unnecessary when connecting to a local instance of openwakeword.

Sounds

You can play a WAV file when the wake word is detected (locally or remotely), and when speech-to-text has completed:

  • --awake-wav <WAV> - played when the wake word is detected
  • --done-wav <WAV> - played when the voice command is finished

If you want to play audio files other than WAV, use event commands: --detection-command replaces --awake-wav, and --transcript-command replaces --done-wav.

Audio Enhancements

Install the dependencies for webrtc:

.venv/bin/pip3 install 'webrtc-noise-gain==1.2.3'

Run the satellite with automatic gain control and noise suppression:

script/run \
  ... \
  --mic-auto-gain 5 \
  --mic-noise-suppression 2

Automatic gain control ranges from 0 to 31 dBFS, with 31 being the loudest. Noise suppression ranges from 0 to 4, with 4 being maximum suppression (may cause audio distortion).

You can also use --mic-volume-multiplier X to multiply all audio samples by X. For example, using 2 for X will double the microphone volume (but may cause audio distortion). The corresponding --snd-volume-multiplier does the same for audio playback.
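To see why a multiplier can distort audio, consider what happens to 16-bit samples when they are scaled. The sketch below is an illustration under the assumption of S16_LE audio with clipping at the 16-bit limits; the satellite's actual implementation may differ, and multiply_volume is a hypothetical name.

```python
import struct


def multiply_volume(chunk, multiplier):
    """Scale 16-bit little-endian PCM samples, clipping to the valid range."""
    count = len(chunk) // 2
    samples = struct.unpack("<%dh" % count, chunk[: count * 2])
    scaled = [
        # Values outside [-32768, 32767] cannot be stored in 16 bits,
        # so they are clamped; this clamping is heard as distortion.
        max(-32768, min(32767, int(sample * multiplier)))
        for sample in samples
    ]
    return struct.pack("<%dh" % count, *scaled)


original = struct.pack("<3h", 1000, -2000, 20000)
doubled = multiply_volume(original, 2.0)
print(struct.unpack("<3h", doubled))  # (2000, -4000, 32767)
```

The first two samples double cleanly, while the third would be 40000 and is clamped to 32767.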

Event Commands

Satellites can respond to events from the server by running commands:

  • --startup-command - run when satellite starts (no stdin)
  • --detect-command - wake word detection has started, but not detected yet (no stdin)
  • --streaming-start-command - audio has started streaming to server (no stdin)
  • --streaming-stop-command - audio has stopped streaming to server (no stdin)
  • --detection-command - wake word is detected (wake word name on stdin)
  • --transcript-command - speech-to-text transcript is returned (text on stdin)
  • --stt-start-command - user started speaking (no stdin)
  • --stt-stop-command - user stopped speaking (no stdin)
  • --synthesize-command - text-to-speech text is returned (text on stdin)
  • --tts-start-command - text-to-speech response started streaming from server (no stdin)
  • --tts-stop-command - text-to-speech response stopped streaming from server. Audio may still be playing through the snd service (no stdin)
  • --tts-played-command - text-to-speech audio finished playing (no stdin)
  • --error-command - an error was sent from the server (text on stdin)
  • --connected-command - satellite connected to server
  • --disconnected-command - satellite disconnected from server

For more advanced scenarios, use an event service (--event-uri). See wyoming_satellite/example_event_client.py for a basic client that just logs events.
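For the events above that deliver text on stdin, a command can be as small as a script that reads the text and logs it. The sketch below shows that pattern; format_event is a hypothetical helper, and the demonstration feeds it an in-memory stream so that it runs anywhere.

```python
import io


def format_event(event_name, stream):
    """Read the event text from a stream and return one log line."""
    text = stream.read().strip()
    if text:
        return "[%s] %s" % (event_name, text)
    return "[%s]" % event_name


# Demonstration with an in-memory stream; a real script would pass sys.stdin.
line = format_event("transcript", io.StringIO("turn on the lights\n"))
print(line)  # [transcript] turn on the lights
```

In a real event command, the last two lines would be replaced by a call that passes sys.stdin, and the script's path would be given to an option such as --transcript-command. How the satellite invokes the command beyond supplying text on stdin is not covered here, so test the script by hand first.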
