  • Language: Python
  • License: MIT License
  • Created over 5 years ago
  • Updated over 4 years ago

Repository Details

Variational auto-encoders for audio

UPDATE (20.5.20): I decided to move the code for reproducing the paper "Learning Disentangled Representations of Timbre and Pitch for Musical Instrument Sounds Using Gaussian Mixture Variational Autoencoders" out of this repo (it is available from here).

vae-audio

For lovers of variational auto-encoders (VAEs) and audio/music. Built on PyTorch.

Overview

The repo is under construction.

The project is built to facilitate research on using VAEs to model audio. It provides:

  • vanilla VAE
  • Gaussian mixture VAE
  • vector-quantized VAE
  • customizable model options
  • audio feature extraction
  • model testing and latent space visualization
  • end-to-end audio feature extraction and model training
  • higher-level wrappers for easier use
  • easier installation
  • documentation
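
For readers new to the vanilla VAE listed above, here is a minimal sketch of the two pieces that distinguish its training objective: the reparameterized sample and the KL term against a standard normal prior. It is written in plain Python so it runs without torch and is not the repo's implementation. The function names are hypothetical, and the squared-error reconstruction term is an assumption (a Gaussian likelihood up to constants).

```python
import math
import random


def reparameterize(mu, logvar, rng=random):
    """Draw z = mu + sigma * eps with eps ~ N(0, 1), one value per latent dimension."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]


def kl_to_standard_normal(mu, logvar):
    """KL(N(mu, diag(exp(logvar))) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv) for m, lv in zip(mu, logvar))


def negative_elbo(x, x_hat, mu, logvar):
    """Single-sample loss: squared reconstruction error (an assumption) plus the KL term."""
    recon = sum((a - b) ** 2 for a, b in zip(x, x_hat))
    return recon + kl_to_standard_normal(mu, logvar)
```

In a PyTorch model the same formulas would operate on tensors, with `mu` and `logvar` produced by the encoder and `x_hat` by the decoder.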

The project structure is based on PyTorch Template.

Requirements

  • torch 1.1.0
  • librosa 0.6.3

Usage

Audio Feature Extraction

  1. Define customized Dataset classes in dataset/datasets.py
  2. Run python dataset/audio_transform.py -c your_config_of_audio_transform.json to compute audio features (e.g., spectrograms)
  3. Define customized DataLoader classes in data_loader/data_loaders.py
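
As a rough illustration of step 1, a customized dataset could look like the sketch below. The class name, the folder layout and the label-from-directory rule are all hypothetical, not taken from dataset/datasets.py. It is a plain map-style class (`__len__` plus `__getitem__`); in the repo it would most likely subclass `torch.utils.data.Dataset`, which is omitted here so the sketch runs without torch.

```python
from pathlib import Path


class AudioFolderDataset:
    """Hypothetical map-style dataset: one item per audio file found under root.

    The label is the name of the file's parent directory. That layout is an
    assumption made for this sketch, not something the repo prescribes.
    """

    def __init__(self, root, extensions=(".wav",)):
        self.root = Path(root)
        # Sort so that indices are stable across runs.
        self.files = sorted(
            p for p in self.root.rglob("*")
            if p.is_file() and p.suffix.lower() in extensions
        )

    def __len__(self):
        return len(self.files)

    def __getitem__(self, index):
        path = self.files[index]
        return str(path), path.parent.name
```

Each item is a (file path, label) pair. Loading the waveform and computing features such as spectrograms would then happen in the transform step.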

Model Training

Run python train.py -c your_config_of_model_train.json
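
The `-c` flag points the script at a JSON config file. A minimal sketch of how such a flag can be parsed is shown below. It only illustrates the pattern and is not the actual train.py: the function name `load_config` and the long option `--config` are assumptions.

```python
import argparse
import json


def load_config(argv=None):
    """Parse `-c path/to/config.json` and return the decoded JSON object."""
    parser = argparse.ArgumentParser(description="Load a JSON config given with -c")
    # Only -c appears in this README; the --config long form is an assumption.
    parser.add_argument("-c", "--config", required=True, help="path to a JSON config file")
    args = parser.parse_args(argv)
    with open(args.config, "r", encoding="utf-8") as handle:
        return json.load(handle)
```

Calling `load_config()` with no arguments reads the command line, so a script using it could be started as python your_script.py -c your_config_of_model_train.json.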

To Be Continued