PyTorch Implementation of Multi-Singer (ACM-MM'21)

Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus

PyTorch implementation of Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus (ACM MM'21).

arXiv GitHub Stars MIT License

Requirements

See requirements in requirement.txt:

  • Linux
  • Python 3.6
  • PyTorch 1.0+
  • librosa
  • json, tqdm, logging

Getting started

Apply recipe to your own dataset

  • Put your wav files in the data directory
  • Edit the configuration in config/config.yaml
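Before preprocessing, it can help to confirm what is actually in the data directory. The following is a minimal sketch, not part of the repository: list_wavs is a hypothetical helper, and it relies on Python's standard wave module, which reads only uncompressed PCM files. It reports the sample rate and duration of each file so they can be compared against the values set in config/config.yaml.

```python
import wave
from pathlib import Path


def list_wavs(data_dir):
    """Return (path, sample_rate, duration_seconds) for each readable PCM .wav file."""
    entries = []
    for path in sorted(Path(data_dir).rglob("*.wav")):
        try:
            with wave.open(str(path), "rb") as wf:
                rate = wf.getframerate()
                duration = wf.getnframes() / rate
        except (wave.Error, EOFError) as exc:
            # The stdlib reader rejects non-PCM or malformed files; report and skip them.
            print(f"skipping {path}: {exc}")
            continue
        entries.append((path, rate, duration))
    return entries
```

For example, list_wavs("data/wavs") returns one tuple per readable file; a mix of sample rates in the result suggests the audio should be resampled before feature extraction.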

1. Pretrain

Use our pretrained encoder checkpoint, or train the encoder yourself and set enc_model_fpath in config/config.yaml accordingly. If you train your own encoder, set its parameters to match those in encoder/params_data and encoder/params_model.

2. Preprocess

Extract mel-spectrogram

python preprocess.py -i data/wavs -o data/feature -c config/config.yaml

-i your audio folder

-o output acoustic feature folder

-c config file

3. Train

Training conditioned on mel-spectrogram

python train.py -i data/feature -o checkpoints/ --config config/config.yaml

-i acoustic feature folder

-o directory to save checkpoints

--config config file

4. Inference

python inference.py -i data/feature -o outputs/ -c checkpoints/*.pkl -g config/config.yaml

-i acoustic feature folder

-o directory to save generated speech

-c checkpoint file

-g config file
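Steps 2 to 4 can be chained from a single script. The sketch below is an illustration rather than part of the repository: the helper names (newest_checkpoint, preprocess_cmd, train_cmd, inference_cmd, run_pipeline) are hypothetical, and only the flags shown in the commands above are used. It assumes that inference.py expects a single checkpoint file; because subprocess does not expand shell globs such as checkpoints/*.pkl, the most recently modified match is resolved explicitly.

```python
import glob
import os
import subprocess
import sys

CONFIG = "config/config.yaml"


def newest_checkpoint(pattern="checkpoints/*.pkl"):
    """Return the most recently modified file matching the glob pattern."""
    matches = glob.glob(pattern)
    if not matches:
        raise FileNotFoundError(f"no checkpoint matches {pattern!r}")
    return max(matches, key=os.path.getmtime)


def preprocess_cmd(wav_dir="data/wavs", feat_dir="data/feature", config=CONFIG):
    return [sys.executable, "preprocess.py", "-i", wav_dir, "-o", feat_dir, "-c", config]


def train_cmd(feat_dir="data/feature", ckpt_dir="checkpoints/", config=CONFIG):
    return [sys.executable, "train.py", "-i", feat_dir, "-o", ckpt_dir, "--config", config]


def inference_cmd(checkpoint, feat_dir="data/feature", out_dir="outputs/", config=CONFIG):
    return [sys.executable, "inference.py", "-i", feat_dir, "-o", out_dir,
            "-c", checkpoint, "-g", config]


def run_pipeline(ckpt_dir="checkpoints/"):
    """Run preprocess, train, and inference in order, stopping at the first failure."""
    subprocess.run(preprocess_cmd(), check=True)
    subprocess.run(train_cmd(ckpt_dir=ckpt_dir), check=True)
    # Pick one concrete checkpoint instead of passing the unexpanded glob.
    checkpoint = newest_checkpoint(os.path.join(ckpt_dir, "*.pkl"))
    subprocess.run(inference_cmd(checkpoint), check=True)
```

Calling run_pipeline() from the repository root runs the three stages with the default paths from this README. Selecting by modification time is only one possible policy; a specific checkpoint path can be passed to inference_cmd directly.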

5. Singing Voice Synthesis

For Singing Voice Synthesis:

  • Use a modified FastSpeech 2 to synthesize the mel-spectrogram
  • Feed the synthesized mel-spectrogram to Multi-Singer for waveform synthesis

Checkpoint

Trained on OpenSinger

Acknowledgements

GE2E
FastSpeech 2
Parallel WaveGAN

Citation

@inproceedings{huang2021multi,
  title={Multi-Singer: Fast Multi-Singer Singing Voice Vocoder With A Large-Scale Corpus},
  author={Huang, Rongjie and Chen, Feiyang and Ren, Yi and Liu, Jinglin and Cui, Chenye and Zhao, Zhou},
  booktitle={Proceedings of the 29th ACM International Conference on Multimedia},
  pages={3945--3954},
  year={2021}
}

Question

Feel free to contact me at [email protected]