Multitrack Music Transformer

This repository contains the official implementation of "Multitrack Music Transformer" (ICASSP 2023).

Multitrack Music Transformer
Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley and Taylor Berg-Kirkpatrick
IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023

Content

Prerequisites
Preprocessing
- Preprocessed Datasets
- Preprocessing Scripts
Training
- Pretrained Models
- Training Scripts
Generation (Inference)
Evaluation
Acknowledgment
Citation

Prerequisites

We recommend using Conda. You can create the environment with the following command.

conda env create -f environment.yml

Preprocessing

Preprocessed Datasets

The preprocessed datasets are available on Google Drive. You can download them from the command line with gdown as follows.

gdown --id 1owWu-Ne8wDoBYCFiF9z11fruJo62m_uK --folder

Extract the files to data/{DATASET_KEY}/processed/json and data/{DATASET_KEY}/processed/notes, where DATASET_KEY is sod, lmd, lmd_full or snd.
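
As a quick check of the layout described above, the following sketch lists the two target directories for each dataset key and reports whether they exist. It assumes it is run from the repository root; the helper name processed_dirs is hypothetical and not part of this repository.

```python
from pathlib import Path

# Dataset keys listed above.
DATASET_KEYS = ["sod", "lmd", "lmd_full", "snd"]


def processed_dirs(dataset_key, root="data"):
    """Return the json/ and notes/ directories expected for one dataset."""
    base = Path(root) / dataset_key / "processed"
    return [base / "json", base / "notes"]


for key in DATASET_KEYS:
    for path in processed_dirs(key):
        # To create a missing directory: path.mkdir(parents=True, exist_ok=True)
        print(path.as_posix(), "exists" if path.is_dir() else "missing")
```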

Preprocessing Scripts

You can skip this section if you download the preprocessed datasets.

Step 1 -- Download the datasets

Download the Symbolic Orchestral Database (SOD). You can do so from the command line as follows.

wget https://qsdfo.github.io/LOP/database/SOD.zip

We also support the following two datasets:

Lakh MIDI Dataset (LMD):

wget http://hog.ee.columbia.edu/craffel/lmd/lmd_full.tar.gz

SymphonyNet Dataset:

gdown "https://drive.google.com/u/0/uc?id=1j9Pvtzaq8k_QIPs8e2ikvCR-BusPluTb&export=download"

Step 2 -- Prepare the name list

Get a list of filenames for each dataset.

find data/sod/SOD -type f \( -name '*.mid' -o -name '*.xml' \) | cut -c 14- > data/sod/original-names.txt

Note: Change the offset in the cut command for other datasets so that it matches the length of the dataset root path.
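
The offset is the length of the directory prefix plus one, because cut -c counts characters starting from 1. A minimal sketch of that arithmetic follows; the helper name cut_offset is hypothetical and only illustrates the calculation.

```python
def cut_offset(root):
    """Return N for `cut -c N-` so that the leading `root/` is stripped."""
    prefix = root.rstrip("/") + "/"
    # cut -c is 1-indexed, so the first kept character sits at len(prefix) + 1.
    return len(prefix) + 1


print(cut_offset("data/sod/SOD"))  # 14, the value used in the find command above
```

For another dataset, pass the directory given to find and use the returned number in place of 14.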

Step 3 -- Convert the data

Convert the MIDI and MusicXML files into MusPy files for processing.

python convert_sod.py

Note: You may enable multiprocessing with the -j option, for example, python convert_sod.py -j 10 for 10 parallel jobs.

Step 4 -- Extract the note list

Extract a list of notes from the MusPy JSON files.

python extract.py -d sod

Step 5 -- Split training/validation/test sets

Split the processed data into training, validation and test sets.

python split.py -d sod
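
The exact policy used by split.py is not described here. As a rough illustration only, a reproducible random split could look like the sketch below; the 8:1:1 ratios, the fixed seed, and the function name split_names are assumptions, not the script's documented behavior.

```python
import random


def split_names(names, ratio_valid=0.1, ratio_test=0.1, seed=0):
    """Shuffle names deterministically and cut them into train/valid/test."""
    names = sorted(names)  # sort first so the result does not depend on input order
    random.Random(seed).shuffle(names)
    n_valid = int(len(names) * ratio_valid)
    n_test = int(len(names) * ratio_test)
    valid = names[:n_valid]
    test = names[n_valid:n_valid + n_test]
    train = names[n_valid + n_test:]
    return train, valid, test


# Hypothetical file names, used only to exercise the function.
train, valid, test = split_names([f"song_{i:03d}" for i in range(100)])
print(len(train), len(valid), len(test))
```

Using a fixed seed keeps the three subsets stable across runs, which matters when comparing models trained at different times.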

Training

Pretrained Models

The pretrained models are available on Google Drive. You can download all of them from the command line with gdown as follows.

gdown --id 1HoKfghXOmiqi028oc_Wv0m2IlLdcJglQ --folder

Training Scripts

Train a Multitrack Music Transformer model.

Absolute positional embedding (APE):

python mmt/train.py -d sod -o exp/sod/ape -g 0

Relative positional embedding (RPE):

python mmt/train.py -d sod -o exp/sod/rpe --no-abs_pos_emb --rel_pos_emb -g 0

No positional embedding (NPE):

python mmt/train.py -d sod -o exp/sod/npe --no-abs_pos_emb --no-rel_pos_emb -g 0
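
The three commands above differ only in two boolean switches. The sketch below shows one way such paired --flag / --no-flag options can be declared with argparse.BooleanOptionalAction (Python 3.9+). The defaults are inferred from the commands above, and the parser is an illustration rather than the actual contents of mmt/train.py.

```python
import argparse


def build_parser():
    parser = argparse.ArgumentParser()
    # BooleanOptionalAction registers both --abs_pos_emb and --no-abs_pos_emb.
    parser.add_argument("--abs_pos_emb", action=argparse.BooleanOptionalAction, default=True)
    parser.add_argument("--rel_pos_emb", action=argparse.BooleanOptionalAction, default=False)
    return parser


def describe(argv):
    """Map a list of command-line flags to the configuration name used above."""
    args = build_parser().parse_args(argv)
    if args.abs_pos_emb and not args.rel_pos_emb:
        return "APE"
    if args.rel_pos_emb and not args.abs_pos_emb:
        return "RPE"
    if not args.abs_pos_emb and not args.rel_pos_emb:
        return "NPE"
    return "APE+RPE"  # both enabled; not one of the three configurations listed above


print(describe([]))
print(describe(["--no-abs_pos_emb", "--rel_pos_emb"]))
print(describe(["--no-abs_pos_emb", "--no-rel_pos_emb"]))
```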

Generation (Inference)

Generate new samples using a trained model.

python mmt/generate.py -d sod -o exp/sod/ape -g 0

Evaluation

Evaluate the trained model using objective evaluation metrics.

python mmt/evaluate.py -d sod -o exp/sod/ape -ns 100 -g 0

Acknowledgment

The code is based largely on the x-transformers library developed by lucidrains.

Citation

Please cite the following paper if you use the code provided in this repository.

Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian McAuley and Taylor Berg-Kirkpatrick, "Multitrack Music Transformer," IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2023.

@inproceedings{dong2023mmt,
    author = {Hao-Wen Dong and Ke Chen and Shlomo Dubnov and Julian McAuley and Taylor Berg-Kirkpatrick},
    title = {Multitrack Music Transformer},
    booktitle = {IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
    year = 2023,
}
