  • Language
    JavaScript
  • License
    Apache License 2.0
  • Created almost 6 years ago
  • Updated about 1 year ago

Repository Details

Mimic Recording Studio is a Docker-based application you can install to record voice samples, which can then be used to train a TTS voice with Mimic 2.

Mimic Recording Studio

The Mycroft open source Mimic technologies are Text-to-Speech engines which take a piece of written text and convert it into spoken audio. The latest generation of this technology, Mimic 2, uses machine learning techniques to create a model which can speak a specific language, sounding like the voice on which it was trained.

The Mimic Recording Studio simplifies the collection of training data from individuals. Each individual's recordings can be used to produce a distinct voice for Mimic.

Software Quick Start

Windows self-hosted Quick Start

  • git clone https://github.com/MycroftAI/mimic-recording-studio.git
  • cd mimic-recording-studio
  • start-windows.bat

Linux/Mac self-hosted Quick Start

Install Dependencies

Why Docker? It makes the studio easy to set up and run across platforms.

Build and Run

  • git clone https://github.com/MycroftAI/mimic-recording-studio.git

  • cd mimic-recording-studio

  • docker-compose up to build and run (Note: You may need to use sudo docker-compose up depending on your distribution)

    Alternatively, you can build and run separately. docker-compose build then docker-compose up

  • In your browser, go to http://localhost:3000

Note: The first execution of docker-compose up will take a while, as this command also builds the Docker images. Subsequent executions of docker-compose up should start more quickly.

Manual Install, Build and Start

Backend

Dependencies
Build & Run
  • cd backend/
  • pip install -r requirements.txt
  • python run.py

Frontend

Dependencies
Build & Run
  • cd frontend/
  • npm install, alternatively yarn install
  • npm start, alternatively yarn start

Coming soon!

An online hosted version at http://mimic.mycroft.ai, requiring zero setup.

Data

Audio Recordings

WAV files

Audio is saved as WAV files to the backend/audio_file/{uuid}/ directory. The backend automatically trims the beginning and ending silence for all WAV files using ffmpeg.

{uuid}-metadata.txt

Metadata is also saved to backend/audio_file/{uuid}/. This file maps each WAV file name to the phrase spoken. Together with the WAV files, it is what you need to get started on training Mimic 2.

Corpus

For now, we provide an English corpus, english_corpus.csv, which can be found in backend/prompt/. To use your own corpus, follow these steps.

  1. Create a csv file in the same format as english_corpus.csv using tabs (\t) as the delimiter.
  2. Make sure there are no empty lines in the corpus.
  3. Add your corpus to the backend/prompt directory.
  4. Change the CORPUS environment variable in docker-compose.yml to your corpus name.
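
Steps 1 and 2 can be checked before the corpus is copied into backend/prompt. The following is a minimal sketch, not part of Mimic Recording Studio: the function name find_corpus_problems is hypothetical, the sample rows are invented, and the column layout of english_corpus.csv is not assumed. The check only reports empty lines and rows whose tab-separated column count differs from the first row.

```python
def find_corpus_problems(lines):
    """Return (line_number, message) tuples for a tab-delimited corpus.

    Hypothetical helper: it does not know the real column layout of
    english_corpus.csv, so it only compares each row with the first one.
    """
    problems = []
    expected_columns = None
    for number, raw in enumerate(lines, start=1):
        line = raw.rstrip("\r\n")
        if not line.strip():
            problems.append((number, "empty line"))
            continue
        columns = line.split("\t")
        if expected_columns is None:
            # The first non-empty row defines the expected shape.
            expected_columns = len(columns)
        elif len(columns) != expected_columns:
            problems.append(
                (number, f"expected {expected_columns} columns, found {len(columns)}")
            )
    return problems


if __name__ == "__main__":
    # Invented sample rows, for illustration only.
    sample = ["Hello there.\t12\n", "\n", "How are you today?\n"]
    for number, message in find_corpus_problems(sample):
        print(f"line {number}: {message}")
```

To check a real file, pass an open file object instead of the sample list, for example find_corpus_problems(open("my_corpus.csv", encoding="utf-8")), where my_corpus.csv stands for your own corpus name.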

Corpora in other languages

Mimic Recording Studio can also be used to produce voice recordings for TTS voices in languages other than English. If you are building a corpus in another language, we encourage you to choose phrases which:

  • occur in natural, everyday speech in the target language
  • have a variety of string lengths
  • cover a wide variety of phonemes (basic sounds)

IMPORTANT: For now, you must reset the sqlite database to use a new corpus. If you've recorded on another corpus and would like to save that data, you can simply rename your sqlite db found in backend/db/ to another name. The backend will detect that mimicstudio.db is not there and create a new one for you. You may continue recording data for your new corpus.
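
The rename described above can also be scripted. This is a minimal sketch under stated assumptions: the helper name set_aside_database and the target file name are hypothetical, and the script is assumed to run from the repository root. It is probably safest to stop the backend before renaming the file.

```python
from pathlib import Path


def set_aside_database(db_dir, new_name, db_name="mimicstudio.db"):
    """Rename the current database so the backend creates a fresh one.

    Hypothetical helper; db_name defaults to the file name mentioned in
    the note above. Refuses to overwrite an existing file.
    """
    source = Path(db_dir) / db_name
    target = Path(db_dir) / new_name
    if not source.exists():
        raise FileNotFoundError(source)
    if target.exists():
        raise FileExistsError(target)
    source.rename(target)
    return target


if __name__ == "__main__":
    # "mimicstudio-english.db" is an invented name for the old corpus data.
    print(set_aside_database("backend/db", "mimicstudio-english.db"))
```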

Technologies

Frontend

The web UI is built using JavaScript and React, with create-react-app as a scaffolding tool. Refer to CRA.md to find out more about how to use create-react-app.

Functions

  • Record and play audio
  • Generate audio visualization
  • Calculate and display metrics

Backend

The web service is built using Python, with Flask as the backend framework, gunicorn as the HTTP server, and SQLite as the database.

Functions

  • Process audio
  • Serve corpus and metrics data
  • Record info in the database
  • Record data to the file system

Docker

Docker is used to containerize both applications. By default, the frontend uses network port 3000 and the backend uses network port 5000. You can configure these in the docker-compose.yml file.

NOTE: If you are running docker-registry, this runs by default on port 5000, so you will need to change which port you use.
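
Before starting the containers, it can help to confirm that nothing else is already listening on the default ports. The following is a minimal sketch, not part of the project: the helper name port_is_free is hypothetical, and the check only tries to bind the port on the local loopback address.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another service holds the port.
            return False
    return True


if __name__ == "__main__":
    # 3000 (frontend) and 5000 (backend) are the defaults named above.
    for port in (3000, 5000):
        state = "free" if port_is_free(port) else "in use"
        print(f"port {port}: {state}")
```

If a port is reported as in use, change the corresponding port in docker-compose.yml.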

Recording Tips

Creating a voice takes a significant but achievable effort. An individual will need to record 15,000 - 20,000 phrases. To get the best possible Mimic voice, the recordings need to be clean and consistent. To that end, follow these recommendations:

  • Record in a quiet environment with noise-dampening material. If your ears can hear outside noise, so can the microphone. For best results, even the sound of air conditioning blowing through a vent should be avoided. Bare walls create subtle echoes and reverberation. A sound dampening booth is ideal, but you can also create a homemade recording studio using soft materials such as acoustic foam in a closet. Comforters and mattresses can also be used effectively!
  • Speak at a consistent volume and speed. Rushing through the phrases will only result in a lower quality voice.
  • Use a quality microphone. To obtain consistent results, we recommend a headset microphone so your mouth is always the same distance from the mic.
  • Avoid vocal fatigue. Record a maximum of 4 hours a day, taking a break every half hour.
  • Back up your Mimic-Recording-Studio directory on a regular basis to avoid data loss.
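
The backup recommendation can be automated in a few lines. This is a minimal sketch, not an official tool: the function name back_up_directory and the example paths are hypothetical. It writes a timestamped zip archive of the whole directory, and the backup folder should live outside the directory being archived.

```python
import shutil
from datetime import datetime
from pathlib import Path


def back_up_directory(source_dir, backup_dir):
    """Create a timestamped zip of source_dir inside backup_dir.

    Hypothetical helper. Keep backup_dir outside source_dir so the
    archive does not try to include itself.
    """
    source = Path(source_dir).resolve()
    destination = Path(backup_dir).resolve()
    destination.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    base_name = destination / f"{source.name}-{stamp}"
    archive = shutil.make_archive(
        str(base_name), "zip", root_dir=source.parent, base_dir=source.name
    )
    return Path(archive)


if __name__ == "__main__":
    # Example paths are assumptions; adjust them to your own layout.
    print(back_up_directory("mimic-recording-studio", "../studio-backups"))
```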

Advanced

Query database structure

Mimic-Recording-Studio logs every recording in a SQLite database file located under /backend/db/. The file can be opened with database tools such as DBeaver.

The database includes two tables.

Table "audiomodel"

All recordings are persisted in this table with the following fields:

  • recording timestamp (created_date)
  • uuid of speaker (matches the filesystem path under /backend/audio_files/id)
  • wav filename in filesystem (audio_id)
  • text of recorded phrase (phrase)

The database can be used to query your recordings.

Here are some example queries:

-- List all recordings
SELECT * FROM audiomodel;

-- List recordings from January 2020, ordered by phrase
-- (half-open range so that recordings made during 31 January are included)
SELECT * FROM audiomodel WHERE created_date >= '2020-01-01' AND created_date < '2020-02-01' ORDER BY prompt;

-- List the number of recordings per day
SELECT DATE(created_date), COUNT(*) AS RecordingsPerDay
FROM audiomodel
GROUP BY DATE(created_date)
ORDER BY DATE(created_date);

-- Show the average text length of recordings
SELECT AVG(LENGTH(prompt)) AS avgLength FROM audiomodel;

There are many ways that querying the sqlite database might be useful. For example, looking for recordings in a specific time range might help to remove recordings made in a bad environment.
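
Queries like these can also be run from a script rather than a database tool. The sketch below uses Python's built-in sqlite3 module against a throwaway in-memory table. The table layout and the rows are assumptions made for illustration (column names are taken from the example queries, and dates are assumed to be stored as text), not the studio's confirmed schema.

```python
import sqlite3


def recordings_per_day(connection):
    """Return (day, count) rows, oldest day first."""
    cursor = connection.execute(
        "SELECT DATE(created_date), COUNT(*) "
        "FROM audiomodel "
        "GROUP BY DATE(created_date) "
        "ORDER BY DATE(created_date)"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    # In-memory stand-in for the real database file; the schema is assumed.
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE audiomodel (audio_id TEXT, prompt TEXT, created_date TEXT)"
    )
    connection.executemany(
        "INSERT INTO audiomodel VALUES (?, ?, ?)",
        [
            ("a1", "Hello there.", "2020-01-30 09:15:00"),
            ("a2", "How are you?", "2020-01-30 09:16:00"),
            ("a3", "Good night.", "2020-01-31 21:00:00"),
        ],
    )
    for day, count in recordings_per_day(connection):
        print(day, count)
    connection.close()
```

To query real data, replace ":memory:" with the path to the database file under backend/db/ and drop the CREATE TABLE and INSERT statements.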

Table "usermodel"

Mimic-Recording-Studio can be used by more than one speaker using the same sqlite database file.

This table provides the following information per speaker:

  • Unique identifier of speaker (uuid)
  • Name of speaker (user_name)
  • Newest recorded line number of corpus (prompt_num)
  • Total recording time (total_time_spoken)
  • How many chars have been recorded (len_char_spoken)

These values are used to calculate metrics. For example, the speaking pace may show if the recorded phrase is too fast or slow compared to previous recordings.
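
As an illustration of how such a metric could be derived from these columns, the sketch below computes a speaking pace in characters per second and compares one phrase against the speaker's running average. It is not the studio's actual formula: the function names are hypothetical, and total_time_spoken is assumed to be measured in seconds.

```python
def chars_per_second(len_char_spoken, total_time_spoken):
    """Average pace; total_time_spoken is assumed to be in seconds."""
    if total_time_spoken <= 0:
        return 0.0
    return len_char_spoken / total_time_spoken


def relative_pace(phrase_chars, phrase_seconds, len_char_spoken, total_time_spoken):
    """Ratio of one phrase's pace to the speaker's average.

    1.0 means the same pace, above 1.0 faster, below 1.0 slower.
    Returns None when there is no history to compare against.
    """
    average = chars_per_second(len_char_spoken, total_time_spoken)
    if average == 0:
        return None
    return chars_per_second(phrase_chars, phrase_seconds) / average


if __name__ == "__main__":
    # Invented numbers: 300 characters over 20 seconds so far,
    # then a new 30-character phrase spoken in 1 second.
    print(chars_per_second(300, 20))
    print(relative_pace(30, 1.0, 300, 20))
```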

Query the "usermodel" table to get a list of speakers, including their uuid and some recording statistics.

SELECT user_name AS [name], uuid FROM usermodel;

Modify recorder uuid

The browser used to record your phrases persists the user's uuid and name in its localStorage to keep them in sync with the SQLite database and the filesystem.

If a problem occurs and your browser loses or changes the uuid mapping for Mimic-Recording-Studio, you may have difficulty continuing a previous recording session. In that case, update the following two attributes in your browser's localStorage:

Open Mimic-Recording-Studio in your browser, go to the web developer tools, open localStorage, and set name and uuid to their original values.

After that you should be able to continue your previous recording session without further problems.

Providing your recording to Mycroft for training

We welcome your voice donations to Mycroft for use in Text-to-Speech applications. If you would like to provide your voice recordings, you must license them to us under the Creative Commons CC0 Public Domain license so that we can utilise them in TTS voices - which are derivative works. If you're ready to donate your voice recordings, email us at [email protected].

Contributions

PRs are gladly accepted!

Where to get support and assistance

You can get help and support with Mimic Recording Studio at:
