UPDATE (20.5.20): I decided to isolate the code for reproducing the paper "Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders" from this repo.
For variational auto-encoders (VAEs) and audio/music lovers, based on PyTorch.
The repo is under construction.
The project is built to facilitate research on using VAEs to model audio. It provides
- vanilla VAE
- Gaussian mixture VAE
- vector-quantized VAE
- customizable model options
- audio feature extraction
- model testing and latent space visualization
- end-to-end audio feature extraction and model training
- higher-level wrappers for easier use
- easier installation
- documentation
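As a rough illustration of the vanilla VAE item in the list above, here is a minimal sketch in PyTorch. It is not the model code from this repo: the class name `TinyVAE`, the function `vae_loss`, the layer sizes, and the squared-error reconstruction term are all assumptions chosen for brevity, and the random input only stands in for real audio features.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class TinyVAE(nn.Module):
    """Hypothetical minimal VAE for illustration; not this repo's model class."""

    def __init__(self, input_dim=128, hidden_dim=64, latent_dim=8):
        super().__init__()
        self.enc = nn.Linear(input_dim, hidden_dim)
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)
        self.dec = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, input_dim),
        )

    def forward(self, x):
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterization trick: z = mu + sigma * eps, with eps ~ N(0, I)
        z = mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)
        return self.dec(z), mu, logvar


def vae_loss(x, x_hat, mu, logvar):
    # Squared-error reconstruction term plus KL(q(z|x) || N(0, I)),
    # summed over dimensions and averaged over the batch
    recon = F.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return (recon + kl) / x.size(0)


torch.manual_seed(0)
model = TinyVAE()
x = torch.randn(4, 128)  # stand-in for a batch of 4 spectrogram frames
x_hat, mu, logvar = model(x)
loss = vae_loss(x, x_hat, mu, logvar)
```

The Gaussian mixture and vector-quantized variants change the prior or the latent bottleneck, so this sketch covers only the simplest case.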
The project structure is based on PyTorch Template.
- torch 1.1.0
- librosa 0.6.3
- Define customized Dataset classes in dataset/datasets.py
- Run python dataset/audio_transform.py -c your_config_of_audio_transform.json to compute audio features (e.g., spectrograms)
- Define customized DataLoader classes in data_loader/data_loaders.py
- Run python train.py -c your_config_of_model_train.json
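For the Dataset and DataLoader steps above, a customized class might look roughly like the sketch below. This is an assumption-laden example, not the contents of dataset/datasets.py or data_loader/data_loaders.py: the name `SpectrogramDataset` is hypothetical, the features are random tensors rather than output of audio_transform.py, and it uses the plain `torch.utils.data.DataLoader` instead of whatever base class the repo defines.

```python
import torch
from torch.utils.data import Dataset, DataLoader


class SpectrogramDataset(Dataset):
    """Hypothetical dataset of precomputed spectrograms with integer labels."""

    def __init__(self, spectrograms, labels):
        assert len(spectrograms) == len(labels)
        self.spectrograms = spectrograms
        self.labels = labels

    def __len__(self):
        return len(self.spectrograms)

    def __getitem__(self, idx):
        # Return one (feature, label) pair; the DataLoader stacks these into batches
        return self.spectrograms[idx], self.labels[idx]


# Random stand-ins for real features: 10 items, 128 frequency bins, 20 frames
specs = torch.randn(10, 128, 20)
labels = torch.arange(10) % 3
dataset = SpectrogramDataset(specs, labels)

loader = DataLoader(dataset, batch_size=4, shuffle=False)
batch_x, batch_y = next(iter(loader))
```

Here `batch_x` has shape (4, 128, 20) and `batch_y` holds the four matching labels; a real dataset would load the saved features from disk in `__getitem__` instead.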