• This repository has been archived on 28/May/2019
  • Stars
    star
    872
  • Rank 50,274 (Top 2 %)
  • Language
    Python
  • License
    Other
  • Created over 6 years ago
  • Updated about 5 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

A method to generate speech across multiple speakers

VoiceLoop

PyTorch implementation of the method described in the paper VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop.

VoiceLoop is a neural text-to-speech (TTS) that is able to transform text to speech in voices that are sampled in the wild. Some demo samples can be found here.

Quick Links

Quick Start

Follow the instructions in Setup and then simply execute:

python generate.py  --npz data/vctk/numpy_features_valid/p318_212.npz --spkr 13 --checkpoint models/vctk/bestmodel.pth

Results will be placed in models/vctk/results. It will generate 2 samples:

You can also generate the same text but with a different speaker, specifically:

python generate.py  --npz data/vctk/numpy_features_valid/p318_212.npz --spkr 18 --checkpoint models/vctk/bestmodel.pth

Which will generate the following sample.

Here is the corresponding attention plot:

Legend: X-axis is output time (acoustic samples) Y-axis is input (text/phonemes). Left figure is speaker 10, right is speaker 14.

Finally, free text is also supported:

python generate.py  --text "hello world" --spkr 1 --checkpoint models/vctk/bestmodel.pth

Setup

Requirements: Linux/OSX, Python2.7 and PyTorch 0.1.12. Generation requires installing phonemizer, follow the setup instructions there. The current version of the code requires CUDA support for training. Generation can be done on the CPU.

git clone https://github.com/facebookresearch/loop.git
cd loop
pip install -r scripts/requirements.txt

Data

The data used to train the models in the paper can be downloaded via:

bash scripts/download_data.sh

The script downloads and preprocesses a subset of VCTK. This subset contains speakers with american accent.

The dataset was preprocessed using Merlin - from each audio clip we extracted vocoder features using the WORLD vocoder. After downloading, the dataset will be located under subfolder data as follows:

loop
β”œβ”€β”€ data
 Β Β  └── vctk
  Β Β  Β Β  β”œβ”€β”€ norm_info
        β”‚   β”œβ”€β”€ norm.dat
     Β Β  β”œβ”€β”€ numpy_feautres
        β”‚   β”œβ”€β”€ p294_001.npz
        β”‚   β”œβ”€β”€ p294_002.npz
        β”‚   └── ...
     Β Β  └── numpy_features_valid

The preprocess pipeline can be executed using the following script by Kyle Kastner: https://gist.github.com/kastnerkyle/cc0ac48d34860c5bb3f9112f4d9a0300.

Pretrained Models

Pretrainde models can be downloaded via:

bash scripts/download_models.sh

After downloading, the models will be located under subfolder models as follows:

loop
β”œβ”€β”€ data
β”œβ”€β”€ models
    β”œβ”€β”€ blizzard
    β”œβ”€β”€ vctk
    β”‚   β”œβ”€β”€ args.pth
  Β Β β”‚   └── bestmodel.pth
    └── vctk_alt

Update 10/25/2017: Single speaker model available in models/blizzard/

SPTK and WORLD

Finally, speech generation requires SPTK3.9 and WORLD vocoder as done in Merlin. To download the executables:

bash scripts/download_tools.sh

Which results the following sub directories:

loop
β”œβ”€β”€ data
β”œβ”€β”€ models
β”œβ”€β”€ tools
  Β Β β”œβ”€β”€ SPTK-3.9
 Β Β  └── WORLD

Training

Single-Speaker

Single speaker model is trained on blizzard 2011. Data should be downloaded and prepared as described above. Once the data is ready, run:

python train.py --noise 1 --expName blizzard_init --seq-len 1600 --max-seq-len 1600 --data data/blizzard --nspk 1 --lr 1e-5 --epochs 10

Then, continue training the model with :

python train.py --noise 1 --expName blizzard --seq-len 1600 --max-seq-len 1600 --data data/blizzard --nspk 1 --lr 1e-4 --checkpoint checkpoints/blizzard_init/bestmodel.pth --epochs 90

Multi-Speaker

Training a new model on vctk, first train the model using noise level of 4 and input sequence length of 100:

python train.py --expName vctk --data data/vctk --noise 4 --seq-len 100 --epochs 90

Then, continue training the model using noise level of 2, on full sequences:

python train.py --expName vctk_noise_2 --data data/vctk --checkpoint checkpoints/vctk/bestmodel.pth --noise 2 --seq-len 1000 --epochs 90

Citation

If you find this code useful in your research then please cite:

@article{taigman2017voice,
  title           = {VoiceLoop: Voice Fitting and Synthesis via a Phonological Loop},
  author          = {Taigman, Yaniv and Wolf, Lior and Polyak, Adam and Nachmani, Eliya},
  journal         = {ArXiv e-prints},
  archivePrefix   = "arXiv",
  eprinttype      = {arxiv},
  eprint          = {1705.03122},
  primaryClass    = "cs.CL",
  year            = {2017}
  month           = October,
}

License

Loop has a CC-BY-NC license.

More Repositories

1

draft-js

A React framework for building text editors.
JavaScript
22,506
star
2

pop

An extensible iOS and OS X animation library, useful for physics-based interactions.
Objective-C++
19,716
star
3

flux

Application Architecture for Building User Interfaces
JavaScript
17,397
star
4

prepack

A JavaScript bundle optimizer.
JavaScript
14,271
star
5

AsyncDisplayKit

Smooth asynchronous user interfaces for iOS apps.
Objective-C++
13,447
star
6

stetho

Stetho is a debug bridge for Android applications, enabling the powerful Chrome Developer Tools and much more.
Java
12,653
star
7

Shimmer

An easy way to add a simple, shimmering effect to any view in an iOS app.
Objective-C
9,375
star
8

react-360

Create amazing 360 and VR content using React
JavaScript
8,702
star
9

caffe2

Caffe2 is a lightweight, modular, and scalable deep learning framework.
Shell
8,420
star
10

nuclide

An open IDE for web and native mobile development, built on top of Atom
JavaScript
7,816
star
11

KVOController

Simple, modern, thread-safe key-value observing for iOS and OS X.
Objective-C
7,359
star
12

three20

Three20 is an Objective-C library for iPhone developers
Objective-C
7,265
star
13

xctool

An extension for Apple's xcodebuild that makes it easier to test iOS and macOS apps.
Objective-C
6,954
star
14

fbctf

Platform to host Capture the Flag competitions
Hack
6,495
star
15

rebound

A Java library that models spring dynamics and adds real world physics to your app.
Java
5,444
star
16

Keyframes

A library for converting Adobe AE shape based animations to a data format and playing it back on Android and iOS devices.
JavaScript
5,343
star
17

shimmer-android

An easy, flexible way to add a shimmering effect to any view in an Android app.
Java
5,265
star
18

grace

Graceful restart & zero downtime deploy for Go servers.
Go
4,899
star
19

Tweaks

An easy way to fine-tune, and adjust parameters for iOS apps in development.
Objective-C
4,751
star
20

augmented-traffic-control

Augmented Traffic Control: A tool to simulate network conditions
Python
4,331
star
21

fixed-data-table

A React table component designed to allow presenting thousands of rows of data.
JavaScript
4,314
star
22

WebDriverAgent

A WebDriver server for iOS that runs inside the Simulator.
Objective-C
4,096
star
23

huxley

A testing system for catching visual regressions in Web applications.
Python
4,086
star
24

codemod

Codemod is a tool/library to assist you with large-scale codebase refactors that can be partially automated but still require human oversight and occasional intervention. Codemod was developed at Facebook and released as open source.
Python
4,069
star
25

scribe

Scribe is a server for aggregating log data streamed in real time from a large number of servers.
C++
3,932
star
26

FBMemoryProfiler

iOS tool that helps with profiling iOS Memory usage.
Objective-C
3,417
star
27

mention-bot

Automatically mention potential reviewers on pull requests.
JavaScript
3,371
star
28

facebook-php-sdk

This SDK is deprecated. Find the new SDK here: https://github.com/facebook/facebook-php-sdk-v4
PHP
3,289
star
29

origami

A Quartz Composer framework that enables interactive design prototyping without programming.
Objective-C
3,280
star
30

RakNet

RakNet is a cross platform, open source, C++ networking engine for game programmers.
HTML
3,211
star
31

network-connection-class

Listen to current network traffic in the app and categorize the quality of the network.
Java
3,178
star
32

beringei

Beringei is a high performance, in-memory storage engine for time series data.
C++
3,159
star
33

php-graph-sdk

The Facebook SDK for PHP provides a native interface to the Graph API and Facebook Login. https://developers.facebook.com/docs/php
PHP
3,146
star
34

react-native-fbsdk

A React Native wrapper around the Facebook SDKs for Android and iOS. Provides access to Facebook login, sharing, graph requests, app events etc.
Java
2,993
star
35

python-instagram

Python Client for Instagram API
Python
2,966
star
36

conceal

Conceal provides easy Android APIs for performing fast encryption and authentication of data.
C++
2,966
star
37

webscalesql-5.6

WebScaleSQL, Version 5.6, based upon the MySQL-5.6 community releases.
C++
2,954
star
38

ios-snapshot-test-case

Snapshot view unit tests for iOS
Objective-C
2,674
star
39

device-year-class

A library that analyzes an Android device's specifications and calculates which year the device would be considered "high end”.
Java
2,581
star
40

BOLT

Binary Optimization and Layout Tool - A linux command-line utility used for optimizing performance of binaries
2,497
star
41

pfff

Tools for code analysis, visualizations, or style-preserving source transformation.
OCaml
2,439
star
42

fb.resnet.torch

Torch implementation of ResNet from http://arxiv.org/abs/1512.03385 and training scripts
Lua
2,243
star
43

redux-react-hook

React Hook for accessing state and dispatch from a Redux store
TypeScript
2,164
star
44

Surround360

Surround360 is Facebook's open source hardware and software for capturing stereoscopic 3D 360 video for VR. The repo contains hardware designs, as well as software for camera control and rendering.
C++
2,153
star
45

xcbuild

Xcode-compatible build tool.
C++
2,000
star
46

LogDevice

Distributed storage for sequential data
C++
1,888
star
47

MemNN

Memory Networks implementations
Lua
1,757
star
48

rebound-js

Spring dynamics in JavaScript.
JavaScript
1,754
star
49

redis-faina

A query analyzer that parses Redis' MONITOR command for counter/timing stats about query patterns
Python
1,749
star
50

fb-flo

A Chrome extension that lets you modify running apps without reloading them.
JavaScript
1,692
star
51

planout

PlanOut is a library and interpreter for designing online experiments.
JavaScript
1,664
star
52

libphenom

An eventing framework for building high performance and high scalability systems in C.
C
1,662
star
53

flashcache

A general purpose, write-back block cache for Linux.
C
1,601
star
54

python-nubia

A command-line and interactive shell framework.
Python
1,595
star
55

profilo

A library for performance traces from production.
C
1,577
star
56

facebook-swift-sdk

Integrate your iOS apps in Swift with Facebook Platform.
Swift
1,519
star
57

instagram-ruby-gem

The official gem for the Instagram API
Ruby
1,461
star
58

inject

Package inject provides a reflect based injector.
Go
1,393
star
59

Flicks

A unit of time defined in C++.
C++
1,388
star
60

duckling_old

Deprecated in favor of https://github.com/facebook/duckling
Clojure
1,322
star
61

connect-js

Legacy JavaScript SDK
JavaScript
1,237
star
62

atom-in-orbit

Putting Atom in the browser
JavaScript
1,183
star
63

phpsh

A read-eval-print-loop for php
Emacs Lisp
1,160
star
64

C3D

C3D is a modified version of BVLC caffe to support 3D ConvNets.
Jupyter Notebook
1,159
star
65

sublime-react

Sublime Text helpers for React. Syntax highlighting DEPRECATED in favor of babel/babel-sublime
JavaScript
1,144
star
66

fb-adb

A better shell for Android devices
C
1,139
star
67

iTorch

IPython kernel for Torch with visualization and plotting
Jupyter Notebook
1,104
star
68

FBAllocationTracker

iOS library that helps tracking all allocated Objective-C objects
Objective-C++
1,094
star
69

fbcunn

Facebook's extensions to torch/cunn.
Lua
1,069
star
70

emitter

A JS EventEmitter foundation for evented code
JavaScript
1,041
star
71

bistro

Bistro is a flexible distributed scheduler, a high-performance framework supporting multiple paradigms while retaining ease of configuration, management, and monitoring.
C++
1,040
star
72

relay-starter-kit

Barebones starting point for a Relay application.
JavaScript
1,017
star
73

torchnet

Torch on steroids
Lua
992
star
74

react-meteor

React rendering for Meteor apps
JavaScript
953
star
75

atom-ide-ui

A collection of user interfaces for Atom IDE.
JavaScript
936
star
76

NAMAS

Neural Attention Model for Abstractive Summarization
Lua
910
star
77

nifty

Thrift on Netty
Java
899
star
78

swift

An annotation-based Java library for creating Thrift serializable types and services.
Java
889
star
79

bAbI-tasks

Task generation for testing text understanding and reasoning
Lua
886
star
80

hadoop-20

Facebook's Realtime Distributed FS based on Apache Hadoop 0.20-append
Java
876
star
81

IGInterfaceDataTable

A category on WKInterfaceTable that makes configuring tables with multi-dimensional data easier.
Objective-C
837
star
82

mononoke

A Mercurial source control server, specifically designed to support large monorepos.
822
star
83

react-page

Easy Application Development with React JavaScript
JavaScript
795
star
84

f8DeveloperConferenceApp

[Archive] f8 2014 Conference App
HTML
761
star
85

nailgun

Nailgun is a client, protocol, and server for running Java programs from the command line without incurring the JVM startup overhead.
Java
734
star
86

WEASEL

DNS covert channel implant for Red Teams.
Python
725
star
87

RiftDK1

Firmware, Schematics, and Mechanicals for the Oculus Rift Development Kit 1
C
688
star
88

jcommon

concurrency, collections, stats/analytics, config, testing, etc
Java
677
star
89

proguard

A fork of ProGuard.
Java
661
star
90

bootstrapped

Generate bootstrapped confidence intervals for A/B testing in Python.
Python
631
star
91

ig-lazy-module-loader

Library that implements module lazy loading.
Java
630
star
92

opencompute

A community of engineers whose mission is to design and enable the delivery of the most efficient server, storage and data center hardware designs for scalable computing.
TeX
624
star
93

flint

An open-source lint program for C++ developed by, and formerly used at Facebook.
D
622
star
94

fblualib

Facebook libraries and utilities for Lua
Lua
615
star
95

remodel

Remodel is a tool that helps iOS and OS X developers avoid repetitive code by generating Objective-C models that support coding, value comparison, and immutability.
TypeScript
609
star
96

eyescream

natural image generation using ConvNets
Lua
599
star
97

react-python

Python bridge to JSX & the React JavaScript library.
Python
576
star
98

spacetime

Experimental iOS library for live transformations on parts of layers.
Objective-C
528
star
99

warp

A fast preprocessor for C and C++
D
521
star
100

FBNotifications

Facebook Analytics In-App Notifications Framework
Objective-C
494
star