• Stars
    45
  • Rank 624,037 (Top 13 %)
  • Language
    Python
  • License
    MIT License
  • Created over 3 years ago
  • Updated over 1 year ago

Repository Details

PyTorch Implementation of Google's Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions (Tacotron 2). This implementation supports both single-speaker and multi-speaker TTS, along with several techniques to improve the robustness and efficiency of the model.

More Repositories

1. PortaSpeech (Python, 329 stars)
   PyTorch Implementation of PortaSpeech: Portable and High-Quality Generative Text-to-Speech

2. Comprehensive-Transformer-TTS (Python, 319 stars)
   A Non-Autoregressive Transformer-based Text-to-Speech, supporting a family of SOTA transformers with supervised and unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate TTS

3. DiffGAN-TTS (Python, 310 stars)
   PyTorch Implementation of DiffGAN-TTS: High-Fidelity and Efficient Text-to-Speech with Denoising Diffusion GANs

4. Expressive-FastSpeech2 (Python, 276 stars)
   PyTorch Implementation of Non-autoregressive Expressive (emotional, conversational) TTS based on FastSpeech2, supporting English, Korean, and your own languages.

5. DiffSinger (Python, 233 stars)
   PyTorch Implementation of DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (focused on DiffSpeech)

6. DailyTalk (Python, 194 stars)
   Official repository of DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech, ICASSP 2023

7. StyleSpeech (Python, 189 stars)
   PyTorch Implementation of Meta-StyleSpeech: Multi-Speaker Adaptive Text-to-Speech Generation

8. Parallel-Tacotron2 (Python, 187 stars)
   PyTorch Implementation of Google's Parallel Tacotron 2: A Non-Autoregressive Neural TTS Model with Differentiable Duration Modeling

9. Cross-Speaker-Emotion-Transfer (Python, 180 stars)
   PyTorch Implementation of ByteDance's Cross-speaker Emotion Transfer Based on Speaker Condition Layer Normalization and Semi-Supervised Training in Text-To-Speech

10. STYLER (Python, 156 stars)
    Official repository of STYLER: Style Factor Modeling with Rapidity and Robustness via Speech Decomposition for Expressive and Controllable Neural Text to Speech, INTERSPEECH 2021

11. Comprehensive-E2E-TTS (Python, 144 stars)
    A Non-Autoregressive End-to-End Text-to-Speech (text-to-wav), supporting a family of SOTA unsupervised duration modeling. This project grows with the research community, aiming to achieve the ultimate E2E-TTS

12. Soft-DTW-Loss (Python, 120 stars)
    PyTorch Implementation of Soft-DTW: a Differentiable Loss Function for Time-Series in CUDA

13. VAENAR-TTS (Python, 71 stars)
    PyTorch Implementation of VAENAR-TTS: Variational Auto-Encoder based Non-AutoRegressive Text-to-Speech Synthesis.

14. FastPitchFormant (Python, 71 stars)
    PyTorch Implementation of NCSOFT's FastPitchFormant: Source-filter based Decomposed Modeling for Speech Synthesis

15. WaveGrad2 (Python, 66 stars)
    PyTorch Implementation of Google Brain's WaveGrad 2: Iterative Refinement for Text-to-Speech Synthesis

16. Daft-Exprt (Python, 56 stars)
    PyTorch Implementation of Daft-Exprt: Robust Prosody Transfer Across Speakers for Expressive Speech Synthesis

17. evaluate-zero-shot-tts (Python, 43 stars)
    Evaluation Protocol for Large-Scale Zero-Shot TTS Literature

18. Robust_Fine_Grained_Prosody_Control (Python, 41 stars)
    PyTorch Implementation of Robust and fine-grained prosody control of end-to-end speech synthesis

19. Stepwise_Monotonic_Multihead_Attention (Python, 31 stars)
    PyTorch Implementation of Stepwise Monotonic Multihead Attention similar to Enhancing Monotonicity for Robust Autoregressive Transformer TTS

20. Deep-Learning-TTS-Template (Python, 15 stars)
    This is a template for the Non-autoregressive Deep Learning-Based TTS model (in PyTorch).

21. tacotron2_MMI (Jupyter Notebook, 5 stars)
    Another PyTorch implementation of Tacotron2 MMI (with WaveGlow) which supports n_frames_per_step>1 mode (reduction windows) and diagonal guided attention for robust alignments.

22. Fully_Hierarchical_Fine_Grained_TTS (2 stars)
    PyTorch Implementation of Fully-hierarchical fine-grained prosody modeling for interpretable speech synthesis (Unofficial)

23. cs231n (Jupyter Notebook, 2 stars)
    cs231n 2020 Spring assignments implementation

24. pintos (HTML, 1 star)
    KAIST CS330 OS pintos Project