  • Stars: 130
  • Rank: 268,265 (Top 6 %)
  • Language: Python
  • License: MIT License
  • Created over 5 years ago
  • Updated over 4 years ago

Repository Details

🎶 dead simple audio classification

pyAudioClassification

Who is this for? 👩‍💻 👨‍💻

People who just want to classify some audio quickly, without having to dive into the world of audio analysis. If you need something a little more involved, check out pyAudioAnalysis or panotti.

Quick install

pip install pyaudioclassification

Requirements

  • Python 3
  • Keras
  • Tensorflow
  • librosa
  • NumPy
  • Soundfile
  • tqdm
  • matplotlib

Quick start

from pyaudioclassification import feature_extraction, train, predict
features, labels = feature_extraction(<data_path>)
model = train(features, labels)
pred = predict(model, <data_path>)

Or, if you're feeling reckless, you could just string them together like so:

# feature_extraction returns (features, labels), so unpack it with * before passing to train
pred = predict(train(*feature_extraction(<training_data_path>)), <prediction_data_path>)

A full example with saving, loading & some dummy data can be found here.


Read below for a more detailed look at each of these calls.

Detailed Guide

Step 1: Preprocessing 🐶 🐱

First, add all your audio files to a directory with the following structure:

data/
├── <class_name>/
│   ├── <file_name>
│   └── ...
└── ...

For example, if you were trying to classify dog and cat sounds, it might look like this:

data/
├── cat/
│   ├── cat1.ogg
│   ├── cat2.ogg
│   ├── cat3.wav
│   └── cat4.wav
└── dog/
    ├── dog1.ogg
    ├── dog2.ogg
    ├── dog3.wav
    └── dog4.wav
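
Before extracting features, it can be worth checking that the directory really follows this layout. The sketch below is not part of pyAudioClassification: list_labeled_files is a hypothetical helper, and it assumes class indices follow the alphabetical order of the sub-directory names, which may differ from the ordering the library uses internally.

```python
import os


def list_labeled_files(data_path):
    """Return (class_names, samples) for a data/<class_name>/<file_name> layout.

    Hypothetical helper, not part of pyaudioclassification. Class indices are
    assigned by sorting the directory names alphabetically (an assumption).
    """
    class_names = sorted(
        entry for entry in os.listdir(data_path)
        if os.path.isdir(os.path.join(data_path, entry))
    )
    samples = []
    for index, name in enumerate(class_names):
        class_dir = os.path.join(data_path, name)
        for file_name in sorted(os.listdir(class_dir)):
            path = os.path.join(class_dir, file_name)
            if os.path.isfile(path):
                # Pair every audio file with the index of its class folder.
                samples.append((path, index))
    return class_names, samples
```

For the cat/dog example, class_names would be ['cat', 'dog'], and every file under cat/ would be paired with index 0.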

Great, now we need to preprocess this data. Just call feature_extraction(<data_path>) and it'll return our input and target data. Something like this:

features, labels = feature_extraction('/Users/mac2015/data/')

(If you don't want to print to stdout, just pass verbose=False as an argument.)


Depending on how much data you have, this process could take a while, so it might be a good idea to save the result. You can save and load it with NumPy:

import numpy as np

np.save('%s.npy' % <file_name>, features)
features = np.load('%s.npy' % <file_name>)
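
The snippet above stores only the features, but the labels are needed again for training. A minimal sketch of keeping both in one file with np.savez follows. The dummy arrays, their shapes, and the file name are placeholders for illustration, not what feature_extraction actually returns.

```python
import os
import tempfile

import numpy as np

# Dummy stand-ins for the output of feature_extraction (shapes are made up).
features = np.random.rand(8, 5)
labels = np.array([0, 0, 0, 0, 1, 1, 1, 1])

# np.savez stores several named arrays in a single .npz archive.
path = os.path.join(tempfile.mkdtemp(), "dataset.npz")
np.savez(path, features=features, labels=labels)

# np.load on an .npz file returns a dict-like object keyed by those names.
with np.load(path) as data:
    loaded_features = data["features"]
    loaded_labels = data["labels"]
```

Saving both arrays together avoids re-running extraction just to recover the labels.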

Step 2: Training 💪

Next step is to train your model on the data. You can just call...

model = train(features, labels)

...but depending on your dataset, you might need to play around with some of the hyper-parameters to get the best results.

Options

  • epochs: The number of training epochs (full passes over the training data). Default is 50.

  • lr: Learning rate. Increase it to speed up training; decrease it if your loss is 'jumping' and you want more accurate results. Default is 0.01.

  • optimiser: Choose any of these. Default is 'SGD'.

  • print_summary: Prints a summary of the model you'll be training. Default is False.

  • loss_type: Classification type. Default is categorical for >2 classes, and binary otherwise.

You can pass any of these as optional keyword arguments, for example train(features, labels, lr=0.05).
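
To see why a learning rate that is too high makes the loss 'jump', here is a toy sketch of plain gradient descent on the one-parameter loss w². It is unrelated to the model that train actually builds; the function name and the two example rates are illustrative only.

```python
def gradient_descent_losses(lr, steps=10, w=1.0):
    """Minimise loss(w) = w**2 with plain gradient descent; return the loss history."""
    losses = [w ** 2]
    for _ in range(steps):
        grad = 2 * w          # derivative of w**2
        w = w - lr * grad     # standard gradient descent update
        losses.append(w ** 2)
    return losses


small = gradient_descent_losses(lr=0.1)   # loss shrinks on every step
large = gradient_descent_losses(lr=1.1)   # every step overshoots, so loss grows
```

With lr=0.1 each update multiplies w by 0.8, so the loss falls steadily. With lr=1.1 the factor is -1.2, so w flips sign and grows on every step, which is the 'jumping' behaviour a smaller lr is meant to cure.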


Again, you probably want to save your model once it's done training. You can do this with Keras:

from keras.models import load_model

model.save('my_model.h5')
model = load_model('my_model.h5')

Step 3: Prediction 🙏 🙌

Now the fun part: try your trained model on new data!

pred = predict(model, <data_path>)

Your <data_path> should point to a new, untested audio file.

Binary

If you have 2 classes (or if you forced loss_type to 'binary'), pred will just be a single number for each file.

The closer it is to 0, the more strongly the model predicts the first class; the closer it is to 1, the more strongly it predicts the second class.

So for our cat/dog example, if it returns 0.2 it's 80% sure the sound is a cat, and if it returns 0.8 it's 80% sure it's a dog.
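
The reading described above can be written as a small helper. This is a sketch rather than part of the library: interpret_binary is a hypothetical name, the 0.5 cut-off is an assumption, and so is the idea that 'cat' is the first class and 'dog' the second.

```python
def interpret_binary(score, class_names=("cat", "dog"), threshold=0.5):
    """Map a single score in [0, 1] to a (label, confidence) pair.

    Hypothetical helper; the class order and the 0.5 threshold are assumptions.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    if score < threshold:
        # Scores near 0 favour the first class; confidence is the distance from 1.
        return class_names[0], 1.0 - score
    # Scores near 1 favour the second class; the score itself is the confidence.
    return class_names[1], score
```

Calling interpret_binary(0.2) yields the label 'cat' with a confidence of about 0.8, matching the example above.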

Categorical

If you have more than 2 classes (or if you forced loss_type to 'categorical'), pred will be an array for each sound file.

It'll look something like this:

[[1.6454633e-06 3.7017996e-11 9.9999821e-01 1.5900606e-07]]

Each index in the array corresponds to one class, and the value at that index is the model's score for that class.


You can pretty print the predictions by showing them in a leaderboard, like so:

print_leaderboard(pred, <training_data_path>)

It looks like this:

1. Cow 100.0% (index 2)
2. Rooster 0.0% (index 0)
3. Frog 0.0% (index 3)
4. Pig 0.0% (index 1)
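
To make the link between the raw array and the leaderboard explicit, here is an illustrative re-implementation of the ranking step. It is not the library's print_leaderboard: format_leaderboard is a hypothetical name, and the class order is inferred from the indices shown in the output above.

```python
def format_leaderboard(scores, class_names):
    """Return leaderboard lines, highest score first.

    Illustrative sketch; not the library's print_leaderboard.
    """
    # Keep each score's original index so it can be reported next to the class.
    ranked = sorted(enumerate(scores), key=lambda pair: pair[1], reverse=True)
    return [
        f"{rank}. {class_names[index]} {score * 100:.1f}% (index {index})"
        for rank, (index, score) in enumerate(ranked, start=1)
    ]


# Scores copied from the sample prediction above; the class order is inferred
# from the indices in the leaderboard output (an assumption).
pred = [[1.6454633e-06, 3.7017996e-11, 9.9999821e-01, 1.5900606e-07]]
class_names = ["Rooster", "Pig", "Cow", "Frog"]
lines = format_leaderboard(pred[0], class_names)
print("\n".join(lines))
```

Sorting the (index, score) pairs rather than the bare scores is what lets each line report the original class index.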
