  • Stars: 179
  • Rank: 214,039 (Top 5%)
  • Language: Python
  • License: MIT License
  • Created over 4 years ago
  • Updated about 1 year ago

Repository Details

Docker image for Mozilla TTS server

NOTE: Please see coqui-docker for docker images of Coqui TTS (Mozilla TTS's successor)


Mozilla TTS

Multi-platform Docker images for Mozilla TTS. Many thanks to erogol and the community!

Screenshot of web interface

Supported languages (see Released Models):

  • U.S. English (en)
  • Spanish (es)
  • French (fr)
  • German (de)

Supported platforms:

  • x86_64
    • GPU is not supported (no CUDA or GPU-enabled PyTorch)
    • Your CPU must support AVX instructions (no Celeron, etc.)
  • armv7
    • Raspberry Pi 2/3/4 (32-bit)
  • arm64
    • Raspberry Pi 2/3/4 (64-bit)

RAM Limitations

If you're running on a Raspberry Pi with only 1 GB of RAM, you may be unable to load some of the larger models without increasing your swap space. To do so, edit /etc/dphys-swapfile (with sudo) and increase CONF_SWAPSIZE. The value is in MB, and 1000 is recommended. Reboot after editing the file.
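The swap change can also be scripted. The following is a minimal sketch, not part of this repository: set_swap_size is a hypothetical helper that rewrites the CONF_SWAPSIZE line in the text of the swap configuration file. It assumes the file uses plain KEY=VALUE lines.

```python
import re


def set_swap_size(config_text: str, size_mb: int = 1000) -> str:
    """Return config_text with CONF_SWAPSIZE set to size_mb (value in MB)."""
    new_line = f"CONF_SWAPSIZE={size_mb}"
    # Match only active (uncommented) CONF_SWAPSIZE lines.
    pattern = re.compile(r"^[ \t]*CONF_SWAPSIZE=.*$", re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(new_line, config_text)
    # No active setting found: append one at the end of the file.
    return config_text.rstrip("\n") + "\n" + new_line + "\n"
```

To apply it, read /etc/dphys-swapfile, pass its contents through the helper, and write the result back as root. A reboot is still needed afterwards.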

Using

$ docker run -it -p 5002:5002 synesthesiam/mozillatts:<LANGUAGE>

where <LANGUAGE> is one of the supported languages (en, es, fr, de). If no language is given, U.S. English is used.

Visit http://localhost:5002 for the web interface.

Send an HTTP GET request to http://localhost:5002/api/tts?text=your%20sentence to get WAV audio back:

$ curl -G --output - \
    --data-urlencode 'text=Welcome to the world of speech synthesis!' \
    'http://localhost:5002/api/tts' | \
    aplay
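The same GET request can be made from Python using only the standard library. This is a sketch under the assumption that the container is running locally on port 5002. The function names tts_url and synthesize are illustrative and do not come from this repository.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def tts_url(text: str, base: str = "http://localhost:5002") -> str:
    """Build the /api/tts GET URL with the text percent-encoded."""
    query = urlencode({"text": text}, quote_via=quote)
    return f"{base}/api/tts?{query}"


def synthesize(text: str, base: str = "http://localhost:5002") -> bytes:
    """Fetch WAV audio for text from a running server."""
    with urlopen(tts_url(text, base)) as response:
        return response.read()
```

For example, the bytes returned by synthesize("Welcome to the world of speech synthesis!") can be written to a file opened in "wb" mode and played with any WAV player.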

HTTP POST is also supported:

$ curl -X POST -H 'Content-Type: text/plain' --output - \
    --data 'Welcome to the world of speech synthesis!' \
    'http://localhost:5002/api/tts' | \
    aplay
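A Python equivalent of the POST request might look like the sketch below. It mirrors the curl call above (plain-text body, text/plain content type). The helper name build_tts_post is hypothetical, and the default base URL assumes the port mapping shown earlier.

```python
from urllib.request import Request, urlopen


def build_tts_post(text: str, base: str = "http://localhost:5002") -> Request:
    """Build a POST request whose body is the raw text to speak."""
    return Request(
        f"{base}/api/tts",
        data=text.encode("utf-8"),
        headers={"Content-Type": "text/plain"},
        method="POST",
    )


def synthesize_post(text: str, base: str = "http://localhost:5002") -> bytes:
    """Send the request to a running server and return the WAV bytes."""
    with urlopen(build_tts_post(text, base)) as response:
        return response.read()
```

POST avoids URL length limits, which may matter for longer passages of text.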

A /process endpoint is available for compatibility with MaryTTS. Expose the correct port (59125) for maximum compatibility:

$ docker run -it -p 59125:5002 synesthesiam/mozillatts

You should now be able to use software like the Home Assistant MaryTTS integration. Note that only the INPUT_TEXT field is actually used.
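As a rough illustration, a MaryTTS-style request URL for the container started above could be built as follows. This sketch assumes the /process endpoint accepts INPUT_TEXT as a query parameter on a GET request, as MaryTTS itself does. That has not been verified against this server, and marytts_process_url is a hypothetical name.

```python
from urllib.parse import quote, urlencode


def marytts_process_url(text: str, host: str = "localhost", port: int = 59125) -> str:
    """Build a MaryTTS-compatible /process URL carrying only INPUT_TEXT."""
    # Only INPUT_TEXT is sent, since it is the only field the server uses.
    query = urlencode({"INPUT_TEXT": text}, quote_via=quote)
    return f"http://{host}:{port}/process?{query}"
```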

Custom Model

The Docker image is usually built with buildx for multi-platform support. If you just want to build an image for one platform, you can do this:

$ NOBUILDX=1 LANGUAGE=en scripts/build-docker.sh

When you set a LANGUAGE, the build script looks in model/<LANGUAGE>. These files should exist:

  • model/<LANGUAGE>/config.json
  • model/<LANGUAGE>/checkpoint.pth.tar (any name that ends in .pth.tar is fine)
  • model/<LANGUAGE>/scale_stats.npy (optional)

Optionally, you may also include a vocoder:

  • model/<LANGUAGE>/vocoder/config.json
  • model/<LANGUAGE>/vocoder/checkpoint.pth.tar (any name that ends in .pth.tar is fine)
  • model/<LANGUAGE>/vocoder/scale_stats.npy (optional)
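Before building, it can help to check that the layout above is in place. The sketch below is a hypothetical pre-flight check and is not part of the build script. It reports the required files that are missing from a model directory, and checks the vocoder sub-directory only if it exists.

```python
from pathlib import Path


def missing_model_files(model_dir) -> list:
    """Return required files missing from model_dir (paths relative to it)."""
    root = Path(model_dir)
    candidates = [root]
    # The vocoder is optional, so check it only when the directory exists.
    if (root / "vocoder").is_dir():
        candidates.append(root / "vocoder")
    missing = []
    for directory in candidates:
        prefix = "" if directory == root else "vocoder/"
        if not (directory / "config.json").is_file():
            missing.append(prefix + "config.json")
        # Any file name ending in .pth.tar counts as the checkpoint.
        if not any(directory.glob("*.pth.tar")):
            missing.append(prefix + "*.pth.tar")
    return missing
```

An empty list means the required files are present. scale_stats.npy is optional, so it is not checked.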

If the sample rates of the model and the vocoder don't match, the audio will be interpolated.
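To give a feel for what interpolating between sample rates involves, here is a minimal linear-interpolation sketch. The README does not say which method the server uses, so linear interpolation is an assumption made purely for illustration, and resample_linear is a hypothetical function.

```python
def resample_linear(samples, src_rate: int, dst_rate: int) -> list:
    """Resample a sequence of audio samples by linear interpolation."""
    if src_rate == dst_rate or len(samples) < 2:
        return list(samples)
    n_out = int(round(len(samples) * dst_rate / src_rate))
    last = len(samples) - 1
    out = []
    for i in range(n_out):
        # Position of output sample i on the input time axis.
        pos = i * src_rate / dst_rate
        left = min(int(pos), last)
        right = min(left + 1, last)
        frac = pos - left
        out.append(samples[left] * (1 - frac) + samples[right] * frac)
    return out
```

For instance, doubling the rate of [0.0, 1.0, 2.0, 3.0] inserts midpoints between neighbouring samples and holds the final value at the end.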

Docker Download Cache

When building the Docker image, the download directory may contain architecture-specific Python wheels. The download/amd64 directory, for example, will be used with pip's --find-links on x86_64 systems.

The download/shared directory is used for all architectures. If a requirements.txt file is present there, it is used to install the dependencies for Mozilla TTS. This can be used to exclude TensorFlow and other packages, or to pin specific package versions.

Use Docker buildx

To use buildx, you'll need to enable experimental features in the Docker CLI and then set up a private registry:

$ docker run -d -p 15555:5000 --name registry --restart=always registry:2

This registry runs on port 15555. Next, create a configuration file at /etc/docker/buildx.conf with the following contents:

[registry."localhost:15555"]
  http = true
  insecure = true

Note the same port number (15555). Finally, run the following commands to create a builder:

$ docker run --rm --privileged multiarch/qemu-user-static --reset -p yes
$ docker buildx create --config /etc/docker/buildx.conf --use --name mybuilder
$ docker buildx use mybuilder
$ docker buildx inspect --bootstrap

For some reason, these have to be run again after every reboot and will sometimes require removing the builder first.

If all is well, you can build for specific platforms like this:

$ PLATFORMS=linux/arm/v7 LANGUAGE=en DOCKER_REGISTRY=localhost:15555 scripts/build-docker.sh

Note that the limiting factor for most platforms is the availability of a compiled PyTorch wheel. Pre-built wheels are available here for ARM and PyTorch 1.6.0. Put the wheels in the download directory before building.

More Repositories

  1. voice2json (Python, 1,085 stars): Command-line tools for speech and intent recognition on Linux
  2. rhasspy (HTML, 942 stars): Rhasspy voice assistant for offline home automation
  3. opentts (Python, 893 stars): Open Text to Speech Server
  4. homeassistant-satellite (Python, 187 stars): Streaming audio satellite for Home Assistant
  5. old-custom-components (Python, 75 stars): A voice assistant toolkit for Home Assistant
  6. magicpy (Python, 70 stars): An autostereogram (MagicEye) image generator written in Python
  7. coqui-docker (Shell, 55 stars): Docker images for Coqui AI
  8. hassio-addons (Shell, 43 stars): My Hass.IO add-ons
  9. docker-marytts (Shell, 33 stars): MaryTTS text to speech server and a collection of voices for various languages
  10. voice-recorder (Python, 17 stars): Simple tkinter application for recorded voice samples with text prompts
  11. eyecode (JavaScript, 17 stars): Python library for analyzing gaze data from programmers
  12. jsgf-gen (Java, 14 stars): Tool for generating tagged sentences from JSGF grammars
  13. voice2json-profiles (Python, 11 stars): Speech models and artifacts for voice2json
  14. jsgf2fst (Python, 9 stars)
  15. pt-br_pocketsphinx-cmu (Python, 7 stars): Portuguese voice2json profile based on Pocketsphinx
  16. zh-cn_pocketsphinx-cmu (Python, 7 stars): Mandarin voice2json profile based on Pocketsphinx
  17. homeassistant-pipeline (Python, 7 stars): Websocket client for Assist audio pipeline
  18. en-us_deepspeech-mozilla (Python, 7 stars): U.S. English profile for Mozilla DeepSpeech
  19. openwakeword-satellite (Python, 6 stars): Basic satellite for Home Assistant running openWakeWord locally
  20. ru_pocketsphinx-cmu (Python, 6 stars): Russian voice2json profile based on Pocketsphinx
  21. eyecode-tools (Python, 5 stars): A collection of tools for analyzing data from my eyeCode experiment
  22. novice (Python, 5 stars): Special Python image submodule for beginners
  23. en-us_kaldi-zamia (Python, 5 stars): U.S. English voice2json profile based on Kaldi
  24. en-us_pocketsphinx-cmu (Python, 5 stars): U.S. English voice2json profile based on Pocketsphinx
  25. de_deepspeech-aashishag (Python, 5 stars): German profile using Mozilla's DeepSpeech and Aashishag Model
  26. el-gr_pocketsphinx-cmu (Python, 5 stars): Greek voice2json profile based on Pocketsphinx
  27. mnemofy (Python, 4 stars): Python utility to convert between words and mnemonic numbers
  28. rhasspy-profiles (Makefile, 3 stars): Language-specific profiles for Rhasspy Hass.io add-on
  29. motion-sensor (Python, 3 stars): Wakes/sleeps a Raspberry Pi display using a PIR sensor
  30. pl_julius-github (Python, 3 stars): Polish voice2json profile based on Julius
  31. de_kaldi-zamia (Python, 3 stars): German voice2json profile based on Kaldi
  32. wav-chunk (Python, 3 stars): Read or write INFO chunks in WAV files
  33. artwork (Makefile, 3 stars): Some of my art (for some definition of art)
  34. fr_kaldi-guyot (Python, 3 stars): French profile for voice2json using Kaldi with Paul Guyot's TDN 250 model
  35. docker-deepvoice3 (Python, 2 stars): DeepVoice3 web server with pre-trained English models
  36. rhasspy-asr-kaldi (Shell, 2 stars): Automated speech recognition library for Rhasspy using Kaldi
  37. pt-synesthesiam (Jupyter Notebook, 2 stars): CMU Sphinx acoustic model for Portuguese (pt-br)
  38. word2phonemes (Python, 2 stars): Grapheme to phoneme guesser using PyTorch
  39. vi_kaldi-montreal (Python, 2 stars): Vietnamese voice2json profile based on Kaldi
  40. epub3-marytts (Python, 2 stars): MaryTTS voice project builder for pre-aligned EPUB 3 audio e-books
  41. esphome-nabu (C++, 2 stars)
  42. nexus (C#, 2 stars): A collection of Cognitive Science experimental games
  43. hi_pocketsphinx-cmu (Python, 1 star): Hindi voice2json profile based on Pocketsphinx
  44. mycroft-precise-trainer (Python, 1 star): Text to speech wake word training scripts for Mycroft Precise
  45. sv_kaldi-montreal (Python, 1 star): Swedish voice2json profile based on Kaldi
  46. public-domain-sounds (1 star): Compressed WAV files from Public Domain Sounds
  47. de_pocketsphinx-cmu (Python, 1 star): German voice2json profile based on Pocketsphinx
  48. pocketsphinx-python (Python, 1 star): Version of Python Pocketsphinx without sound
  49. es_pocketsphinx-cmu (Python, 1 star): Spanish voice2json profile based on Pocketsphinx
  50. 2014-03-10-uva (Python, 1 star): Software Carpentry repository for University of Virginia bootcamp
  51. lutz (C++, 1 star): C++ library to compute Lutz complexity of a graph
  52. coqui-tts-tests (HTML, 1 star): Test sound files for Coqui TTS
  53. marytts-txt2wav (Java, 1 star): Command-line utility for text to speech with MaryTTS
  54. nl_kaldi-cgn (Python, 1 star): Voice2json profile for Dutch based on Kaldi CGN model
  55. rhasspy-nlu (Python, 1 star): Intent recognition library for Rhasspy
  56. kaldi-docker (Dockerfile, 1 star): Dockerizing a sub-set of Kaldi
  57. ko-kr_kaldi-montreal (Python, 1 star): Korean voice2json profile based on Kaldi
  58. ca-es_pocketsphinx-cmu (Python, 1 star): Catalan voice2json profile based on Pocketsphinx
  59. spatial_entropy (Python, 1 star): Computes an entropy profile for an image using moving averages
  60. kz_pocketsphinx-cmu (Python, 1 star): Kazakh voice2json profile based on Pocketsphinx
  61. wav-decoder (C++, 1 star): Basic WAV file decoder in C++