
deep-listening

Deep learning experiments for audio classification

A full write-up, including technical explanations, design decisions, and a summary of the results achieved, can be found in the associated Project Report.


This project consists of several Jupyter notebooks that implement deep learning audio classifiers.

1-us8k-ffn-extract-explore.ipynb

  • this notebook contains code to extract and visualise audio files from the UrbanSound8K data set
  • the feature extraction process uses audio-processing features from the librosa library, which reduce each recording to 193 data points (a sketch of this kind of extraction follows this list)
  • as the audio information is highly abstracted (we cannot process successive frames using a receptive field), these features are intended to be fed into a feed-forward neural network (FFN)
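
The recipe below is a minimal sketch of how librosa can collapse a clip into exactly 193 summary values (40 MFCCs, 12 chroma bins, 128 mel bands, 7 spectral-contrast bands and 6 tonnetz dimensions, each averaged over time); the notebook's exact feature choices may differ.

```python
import numpy as np
import librosa

def extract_features(file_name):
    """Reduce one recording to a single 193-value feature vector."""
    X, sample_rate = librosa.load(file_name)
    stft = np.abs(librosa.stft(X))
    mfccs = np.mean(librosa.feature.mfcc(y=X, sr=sample_rate, n_mfcc=40), axis=1)          # 40
    chroma = np.mean(librosa.feature.chroma_stft(S=stft, sr=sample_rate), axis=1)          # 12
    mel = np.mean(librosa.feature.melspectrogram(y=X, sr=sample_rate), axis=1)             # 128
    contrast = np.mean(librosa.feature.spectral_contrast(S=stft, sr=sample_rate), axis=1)  # 7
    tonnetz = np.mean(librosa.feature.tonnetz(y=librosa.effects.harmonic(X),
                                              sr=sample_rate), axis=1)                     # 6
    return np.hstack([mfccs, chroma, mel, contrast, tonnetz])                              # 193 values
```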

2-us8k-ffn-train-predict.ipynb

  • this notebook contains the code to load previously extracted features and feed them into a 3-layer FFN, implemented using TensorFlow and Keras (a sketch of this kind of network follows this list)
  • also included is code to evaluate model performance and to generate predictions from individual samples, demonstrating how a trained model would be used to identify the nature of live recordings
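
As an illustration, a 3-layer feed-forward classifier over the 193-dimensional feature vectors might look like the Keras sketch below; the layer sizes and training settings here are placeholders, not necessarily those used in the notebook.

```python
import numpy as np
from tensorflow import keras
from tensorflow.keras import layers

# Illustrative 3-layer FFN over the 193 extracted features;
# UrbanSound8K has 10 sound classes.
model = keras.Sequential([
    layers.Input(shape=(193,)),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(256, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])

# features: array of shape (n_samples, 193); labels: one-hot, shape (n_samples, 10)
# model.fit(features, labels, epochs=50, batch_size=64, validation_split=0.2)

# Predicting the class of a single new recording from its feature vector:
# probs = model.predict(feature_vector.reshape(1, -1))
# predicted_class = np.argmax(probs)
```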

3-us8k-cnn-extract-train.ipynb

  • this notebook extracts audio features suitable for input into a classic 2-layer Convolutional Neural Network (CNN); a sketch of this style of extraction follows this list
  • much more of the audio data is preserved in this approach; as the saved numpy feature data is over 2GB, I haven't included it with this repository, but you can use the code in this notebook to extract it from the original UrbanSound8K data set
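
A rough sketch of this style of extraction is shown below: each clip is converted to a log-scaled mel spectrogram and cut into fixed-size windows that a 2-D CNN can consume. The band and frame counts are illustrative and may differ from the notebook's values.

```python
import numpy as np
import librosa

BANDS, FRAMES = 60, 41  # illustrative patch size (mel bands x time frames)

def extract_patches(file_name, bands=BANDS, frames=FRAMES):
    """Convert one recording into fixed-size log-mel spectrogram patches."""
    y, sr = librosa.load(file_name)
    mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=bands)
    log_mel = librosa.power_to_db(mel)
    patches = []
    # slide a half-overlapping window of `frames` columns across the spectrogram
    for start in range(0, log_mel.shape[1] - frames + 1, frames // 2):
        patches.append(log_mel[:, start:start + frames])
    return np.array(patches)  # shape: (n_patches, bands, frames)
```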

4-us8k-cnn-salamon.ipynb

  • this notebook implements an alternative CNN, similar to one described by Salamon and Bello
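
A minimal Keras sketch of a small 2-D CNN in this spirit is given below; the filter counts, kernel sizes and pooling shapes are placeholders rather than the exact configuration from the notebook or from Salamon and Bello's paper.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Small 2-D CNN over (bands, frames, channels) spectrogram patches.
model = keras.Sequential([
    layers.Input(shape=(60, 41, 1)),
    layers.Conv2D(24, (5, 5), activation="relu"),
    layers.MaxPooling2D((4, 2)),
    layers.Conv2D(48, (5, 5), activation="relu"),
    layers.MaxPooling2D((4, 2)),
    layers.Flatten(),
    layers.Dense(64, activation="relu"),
    layers.Dropout(0.5),
    layers.Dense(10, activation="softmax"),   # 10 UrbanSound8K classes
])
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```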

5-ffbird-cnn.ipynb

  • this notebook uses the Salamon and Bello CNN to process the FreeField1010 data set of field recordings, with the goal of recognising the presence of birdsong (a sketch of the binary-output adaptation follows this list)
  • the data set is not part of this repository, so if you want to run this code you'll need to download the data yourself (see instructions in the notebook)
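
Since only the task changes (bird present or absent), the same convolutional body can be reused with a binary output head; the sketch below is illustrative and assumes a `conv_body` stack of layers ending in a Flatten layer.

```python
from tensorflow import keras
from tensorflow.keras import layers

def build_birdsong_model(conv_body):
    """Reuse an existing convolutional feature extractor for binary birdsong detection."""
    model = keras.Sequential([
        conv_body,                                # shared convolutional layers
        layers.Dense(64, activation="relu"),
        layers.Dropout(0.5),
        layers.Dense(1, activation="sigmoid"),    # probability that birdsong is present
    ])
    model.compile(optimizer="adam",
                  loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```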

7-us8k-rnn-extract-train.ipynb


Do get in touch if you've any questions (me @ jaroncollis . com)