Repository Details

Code for the Million Song Dataset. The dataset contains metadata and audio analysis for a million tracks and is a collaboration between The Echo Nest and LabROSA. See the website for details.

MILLION SONG DATASET

http://labrosa.ee.columbia.edu/millionsong/

January 2011


  • The dataset contains the analysis and metadata for a million songs. The goal is to provide a large dataset for researchers to report results on, hence encouraging algorithms that scale to commercial sizes.

  • Most of the information is provided by The Echo Nest. The dataset is the result of a collaboration between The Echo Nest and LabROSA at Columbia University. This project is funded in part by the NSF.

  • Most of the data is licensed under the same terms as The Echo Nest's API.

    For the SecondHandSongs dataset (cover songs), see the webpage:

    http://labrosa.ee.columbia.edu/millionsong/secondhand

    For the musiXmatch dataset (lyrics), see the webpage:

    http://labrosa.ee.columbia.edu/millionsong/musixmatch

    The code is released under the GNU General Public License. See LICENSE for details.

  • Most details and instructions on how to get the dataset can be found on the project's website:
    http://labrosa.ee.columbia.edu/millionsong/


If you have any questions or comments, post them to the discussion group:

https://groups.google.com/forum/#!forum/millionsongdataset