  • Stars: 856
  • Rank: 53,268 (top 2%)
  • Language: Python
  • License: MIT License
  • Created over 5 years ago
  • Updated over 1 year ago


Repository Details

FastSpeech-Pytorch

An implementation of FastSpeech based on PyTorch.

Update (2020/07/20)

  1. Optimized the training process.
  2. Optimized the implementation of the length regulator.
  3. Switched to the same hyperparameters as FastSpeech2.
  4. Together, changes 1, 2 and 3 make training 3 times faster than before.
  5. Improved speech quality.
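
For readers unfamiliar with the length regulator mentioned in item 2, here is a minimal sketch of the idea in plain Python. It is not this repository's implementation, which operates on batched PyTorch tensors. The function name `length_regulate` and the toy inputs are assumptions made for illustration. The regulator repeats each phoneme-level hidden vector as many times as its predicted duration, so that the sequence length matches the number of mel-spectrogram frames.

```python
from typing import List, Sequence


def length_regulate(
    hidden: Sequence[Sequence[float]], durations: Sequence[int]
) -> List[List[float]]:
    """Repeat hidden[i] durations[i] times along the time axis.

    Hypothetical helper for illustration only; names are not from the repo.
    """
    if len(hidden) != len(durations):
        raise ValueError("hidden and durations must have the same length")
    expanded: List[List[float]] = []
    for vector, duration in zip(hidden, durations):
        if duration < 0:
            raise ValueError("durations must be non-negative")
        # A duration of 0 drops the phoneme from the output entirely.
        expanded.extend(list(vector) for _ in range(duration))
    return expanded


# Three phoneme vectors expanded to 2 + 0 + 3 = 5 frames.
phonemes = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
frames = length_regulate(phonemes, [2, 0, 3])
```

In a tensor-based version the Python loop is usually the bottleneck, which is one plausible reason an optimized length regulator speeds up training.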

Model

My Blog

Prepare Dataset

  1. Download and extract the LJSpeech dataset.
  2. Put the LJSpeech dataset in data.
  3. Unzip alignments.zip.
  4. Put the NVIDIA pretrained WaveGlow model in waveglow/pretrained_model and rename it to waveglow_256channels.pt.
  5. Run python3 preprocess.py.
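
Before running the preprocessing step, it can help to confirm that the files from the steps above are where they are expected. The following is a small sketch of such a check, not part of the repository. The helper `missing_paths` and the list `EXPECTED` are hypothetical. Only the `data` directory and the WaveGlow checkpoint path are taken from the steps above. The location where alignments.zip is extracted is not stated here, so it is left out and would need to be added once known.

```python
from pathlib import Path
from typing import Iterable, List

# Paths named in the preparation steps, relative to the repository root.
# The alignment directory is omitted because its location is not stated.
EXPECTED = [
    "data",
    "waveglow/pretrained_model/waveglow_256channels.pt",
]


def missing_paths(root: str, expected: Iterable[str]) -> List[str]:
    """Return the entries of `expected` that do not exist under `root`."""
    base = Path(root)
    return [rel for rel in expected if not (base / rel).exists()]
```

Calling `missing_paths(".", EXPECTED)` from the repository root returns an empty list when both entries are present.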

Training

Run python3 train.py.

Evaluation

Run python3 eval.py.

Notes

  • In the FastSpeech paper, the authors use a pre-trained Transformer-TTS model to provide the alignment targets. I didn't have a well-trained Transformer-TTS model, so I used Tacotron2 instead.
  • I use the same hyperparameters as FastSpeech2.
  • Audio examples are in sample.
  • Pretrained model.
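
To illustrate the first note, here is a minimal sketch of how phoneme durations can be read off a teacher model's attention alignment, following the approach described in the FastSpeech paper: each mel frame is assigned to the phoneme it attends to most, and a phoneme's duration is the number of frames assigned to it. This is not the code used to produce alignments.zip. The function name `durations_from_alignment` and the toy attention matrix are assumptions made for illustration.

```python
from typing import List, Sequence


def durations_from_alignment(
    attention: Sequence[Sequence[float]], num_phonemes: int
) -> List[int]:
    """Count, for each phoneme, the mel frames that attend to it most.

    attention[t][i] is the weight mel frame t places on phoneme i.
    Hypothetical helper for illustration only.
    """
    durations = [0] * num_phonemes
    for frame_weights in attention:
        # Index of the phoneme with the largest attention weight for this frame.
        best = max(range(num_phonemes), key=lambda i: frame_weights[i])
        durations[best] += 1
    return durations


# Toy alignment: 5 mel frames over 3 phonemes.
attention = [
    [0.9, 0.1, 0.0],
    [0.7, 0.3, 0.0],
    [0.2, 0.8, 0.0],
    [0.0, 0.3, 0.7],
    [0.0, 0.1, 0.9],
]
durations = durations_from_alignment(attention, 3)
```

The durations always sum to the number of mel frames, which is what lets the length regulator expand the phoneme sequence to exactly the target spectrogram length.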

