  • Stars: 141
  • Rank: 259,971 (Top 6 %)
  • Language: Python
  • License: MIT License
  • Created almost 7 years ago
  • Updated about 5 years ago

Repository Details

Training code for the first stage model of deep learning voice conversion

Yukarin: train the first stage model for voice conversion

This repository is a refactoring of the training code for the first stage model of Become Yukarin: Convert your voice to favorite voice.

Japanese README

Supported environment

  • Linux OS
  • Python 3.6

Preparation

Install required libraries

pip install -r requirements.txt

How to run code (preliminary knowledge)

To run a Python script in this repository, set the environment variable PYTHONPATH so that Python can find the yukarin library. For example, you can run scripts/foo.py with the following command:

PYTHONPATH=`pwd` python scripts/foo.py
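
When scripts are launched from Python rather than from a shell, the same setting can be passed through the child process environment. This is a minimal sketch, assuming the repository root is given as `repo_root`; `build_env` and `run_script` are hypothetical helper names and are not part of this repository.

```python
import os
import subprocess
import sys


def build_env(repo_root, base_env=None):
    """Return a copy of the environment with repo_root prepended to PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    existing = env.get("PYTHONPATH", "")
    root = os.path.abspath(repo_root)
    # Keep any PYTHONPATH that is already set, but search repo_root first.
    env["PYTHONPATH"] = root + os.pathsep + existing if existing else root
    return env


def run_script(script, *args, repo_root="."):
    """Run a script with PYTHONPATH pointing at repo_root (hypothetical helper)."""
    command = [sys.executable, script, *args]
    return subprocess.run(command, env=build_env(repo_root), check=True)
```

For example, `run_script("scripts/foo.py")` called from the repository root corresponds to the shell command above.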

Create dataset

Prepare voice data

Put the input and target voice data in two separate directories (e.g. input_wav and target_wav). Each input file must have a target file with the same file name.
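
Before extracting features, it can help to confirm that the two directories really contain matching file names. The following is a minimal sketch using only the standard library; `unmatched_files` is a hypothetical helper, not a script shipped with this repository.

```python
from pathlib import Path


def unmatched_files(input_dir, target_dir):
    """Return the file names that exist in only one of the two directories."""
    input_names = {p.name for p in Path(input_dir).iterdir() if p.is_file()}
    target_names = {p.name for p in Path(target_dir).iterdir() if p.is_file()}
    only_input = sorted(input_names - target_names)
    only_target = sorted(target_names - input_names)
    return only_input, only_target
```

Calling `unmatched_files("input_wav", "target_wav")` should return two empty lists when the dataset is paired correctly.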

Create acoustic feature

Create acoustic feature files from the input and target voice data.

python scripts/extract_acoustic_feature.py \
    -i './input_wav/*' \
    -o './input_feature/'

python scripts/extract_acoustic_feature.py \
    -i './target_wav/*' \
    -o './target_feature/'

Align data

Align the input and target acoustic features along the time axis. The following example creates alignment data between input_feature and target_feature and writes it to aligned_indexes.

python scripts/extract_align_indexes.py \
    -i1 './input_feature/*.npy' \
    -i2 './target_feature/*.npy' \
    -o './aligned_indexes/'
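
To illustrate what time alignment produces, the sketch below implements classic dynamic time warping (DTW) in NumPy. This is an assumption made for illustration: this README does not state which algorithm or which features `extract_align_indexes.py` uses, and `dtw_indexes` is a hypothetical name. Given two feature sequences of different lengths, it returns two index arrays of equal length that pair up corresponding frames.

```python
import numpy as np


def dtw_indexes(x, y):
    """Align two feature sequences with classic DTW.

    x has shape (n, d) and y has shape (m, d).
    Returns index arrays ix, iy of equal length so that x[ix] and y[iy]
    are aligned frame by frame.
    """
    n, m = len(x), len(y)
    # Euclidean distance between every frame of x and every frame of y.
    dist = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2)

    # cost[i, j] is the minimal accumulated distance aligning x[:i] with y[:j].
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            best_prev = min(cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1])
            cost[i, j] = dist[i - 1, j - 1] + best_prev

    # Backtrack from the last cell to the first to recover the warping path.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: cost[s])
        path.append((i - 1, j - 1))
    path.reverse()
    ix, iy = zip(*path)
    return np.array(ix), np.array(iy)
```

The double loop costs O(n·m) time and memory, which is fine for a demonstration but slow for long utterances.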

Calculate frequency statistics

Calculate fundamental frequency (F0) statistics for the input and target voice data. These statistics are needed for pitch conversion.

python scripts/extract_f0_statistics.py \
    -i './input_feature/*.npy' \
    -o './input_statistics.npy'

python scripts/extract_f0_statistics.py \
    -i './target_feature/*.npy' \
    -o './target_statistics.npy'
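
Pitch conversion from such statistics is commonly done as a linear transform of log-F0, replacing the input speaker's mean and standard deviation with the target speaker's. The sketch below shows that common technique under stated assumptions: this README does not describe what `extract_f0_statistics.py` stores in its `.npy` files, and `f0_statistics` and `convert_f0` are hypothetical names.

```python
import numpy as np


def f0_statistics(f0):
    """Mean and standard deviation of log-F0 over voiced frames (f0 > 0)."""
    log_f0 = np.log(f0[f0 > 0])
    return log_f0.mean(), log_f0.std()


def convert_f0(f0, input_stats, target_stats):
    """Map input-speaker F0 onto the target speaker's log-F0 distribution.

    Unvoiced frames (f0 == 0) stay at 0. Assumes the input standard
    deviation is non-zero.
    """
    in_mean, in_std = input_stats
    tg_mean, tg_std = target_stats
    converted = np.zeros_like(f0, dtype=float)
    voiced = f0 > 0
    # Normalize with the input statistics, then rescale with the target ones.
    converted[voiced] = np.exp(
        (np.log(f0[voiced]) - in_mean) / in_std * tg_std + tg_mean
    )
    return converted
```

Working in the log domain means a constant pitch ratio between speakers, such as one octave, becomes a simple shift of the mean.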

Train

Create the training config file config.json

Set input_glob, target_glob and indexes_glob in sample_config.json to match your dataset. You can then start training.
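
The edit can also be scripted. The layout of sample_config.json is not shown in this README, so the sketch below assumes only that the three keys exist somewhere in the JSON tree and replaces them wherever they occur. `set_keys`, `write_config` and the example glob patterns are hypothetical.

```python
import json


def set_keys(node, updates):
    """Recursively replace the values of matching keys in a parsed JSON tree."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key in updates:
                node[key] = updates[key]
            else:
                set_keys(value, updates)
    elif isinstance(node, list):
        for item in node:
            set_keys(item, updates)
    return node


def write_config(sample_path, output_path, updates):
    """Load the sample config, apply the updates, and write a new config file."""
    with open(sample_path) as f:
        config = json.load(f)
    set_keys(config, updates)
    with open(output_path, "w") as f:
        json.dump(config, f, indent=2)
    return config


# Example call; the glob patterns are assumptions based on the directories above.
# write_config("sample_config.json", "config.json", {
#     "input_glob": "./input_feature/*.npy",
#     "target_glob": "./target_feature/*.npy",
#     "indexes_glob": "./aligned_indexes/*.npy",
# })
```

If you write to a separate file such as config.json, pass that file to train.py instead of sample_config.json.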

Train

python train.py \
    sample_config.json \
    ./model_stage1/

Test

Put the test input voice data in a directory (e.g. test_wav), and run voice_change.py.

python scripts/voice_change.py \
    --model_dir './model_stage1' \
    --config_path './model_stage1/config.json' \
    --input_statistics 'input_statistics.npy' \
    --target_statistics 'target_statistics.npy' \
    --output_sampling_rate 24000 \
    --disable_dataset_test \
    --test_wave_dir './test_wav/' \
    --output_dir './output/'

Advanced: with second stage model

Become Yukarin's Second Stage Model can improve the quality of the converted voice.

Train

Train the second stage model by following the Second Stage Model section of Become Yukarin.

Test

Put the test input voice data in a directory (e.g. test_wav), and run voice_change_with_second_stage.py.

python scripts/voice_change_with_second_stage.py \
    --voice_changer_model_dir './model_stage1' \
    --voice_changer_config './model_stage1/config.json' \
    --super_resolution_model './model_stage2/' \
    --super_resolution_config './model_stage2/config.json' \
    --input_statistics 'input_statistics.npy' \
    --target_statistics 'target_statistics.npy' \
    --out_sampling_rate 24000 \
    --disable_dataset_test \
    --dataset_target_wave_dir '' \
    --test_wave_dir './test_wav' \
    --output_dir './output/'

License

MIT License
