  • Stars: 422
  • Rank: 102,128 (Top 3%)
  • Language: Jupyter Notebook
  • License: Other
  • Created over 8 years ago
  • Updated over 5 years ago

Repository Details

Collection of notebooks and scripts related to audio processing and machine learning.

Audio Notebooks

A collection of Jupyter Notebooks related to audio processing.

The notebooks act like interactive utility scripts for converting between different representations, usually stored in data/project/ where project is the dataset you're working with. Generally, if you change data_root near the top of the notebook and run the rest of the notebook, it will do something useful.

Setup

librosa currently needs some extra help on OS X, so make sure to follow the instructions here first.

$ brew install ffmpeg # for loading and saving audio
$ git clone https://github.com/kylemcdonald/AudioNotebooks.git
$ cd AudioNotebooks
$ pip install -r requirements.txt
$ jupyter notebook

Terminology

Here are some words used in the names of the notebooks, and what they mean:

  • Samples refers to one-shot sounds, usually less than 1-2 seconds long. These can be loaded from a directory, like data/project/samples/ or from a precomputed numpy matrix like data/project/samples.npy. When they are stored in a .npy file, all the samples are necessarily concatenated or expanded to be the same length.
  • Multisamples refers to audio that needs to be segmented into samples.
  • Fingerprints refer to small images, usually 32x32 pixels, representing a small chunk of time like 250ms or 500ms. These are calculated with CQT, STFT, or another frequency domain analysis technique. They are useful for running t-SNE or training neural nets.
  • Spritesheets are single files with multiple sounds, either visually as fingerprints or sonically as a sequence of sounds, organized carefully so they can be chopped up again later.
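
As a rough illustration of how variable-length samples could end up in a single matrix like data/project/samples.npy, here is a minimal sketch that forces every sample to the same length. The helper name samples_to_matrix and the choice to truncate long samples and zero-pad short ones are assumptions for illustration, not necessarily what the notebooks do.

```python
import numpy as np

def samples_to_matrix(samples, length):
    """Stack 1-D sample arrays into an (n_samples, length) matrix.

    Hypothetical helper: longer samples are truncated and shorter
    samples are zero-padded at the end.
    """
    matrix = np.zeros((len(samples), length), dtype=np.float32)
    for i, sample in enumerate(samples):
        clipped = np.asarray(sample, dtype=np.float32)[:length]
        matrix[i, :len(clipped)] = clipped
    return matrix

# Two fake "samples" of different lengths (3 and 8 audio frames).
samples = [np.ones(3), np.ones(8)]
matrix = samples_to_matrix(samples, length=5)
print(matrix.shape)
```

A matrix like this can then be written with np.save and read back with np.load.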

Some formats in use:

  • .npy are numpy matrices. Numpy can load and save these very quickly, even for large datasets.
  • .tsv are tab separated files referring to one sample per line, usually with normalized numbers in each column. These are good for loading into openFrameworks apps, or into the browser.
  • .txt are like .tsv but only have one item per line, usually a single string. Also good for loading into openFrameworks apps, or into the browser.
  • .pkl are Pickle files, the native Python serialization format. They are used for saving and loading data structures that have lists of objects with lots of different kinds of values (not just numbers or strings).
  • .h5 is the way Keras saves the weights for a neural net.
  • .json is good for taking what would usually go into a Pickle file, and saving it in a format that can be loaded onto the web. It's also one of the formats used by Keras, as part of a saved model.
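
To make the text-based and Pickle formats above concrete, here is a small sketch using only the Python standard library. The file names, the example values, and the temporary output directory are made up for illustration; the notebooks may lay out their files differently.

```python
import json
import os
import pickle
import tempfile

# Hypothetical per-sample data: normalized 2-D positions, filenames, metadata.
positions = [[0.25, 0.5], [0.75, 1.0]]
filenames = ["kick.wav", "snare.wav"]
metadata = [{"name": "kick.wav", "tags": ["drum"], "duration": 0.4}]

out_dir = tempfile.mkdtemp()
tsv_path = os.path.join(out_dir, "positions.tsv")
txt_path = os.path.join(out_dir, "filenames.txt")
pkl_path = os.path.join(out_dir, "metadata.pkl")
json_path = os.path.join(out_dir, "metadata.json")

# .tsv: one sample per line, tab-separated numeric columns.
with open(tsv_path, "w") as f:
    for row in positions:
        f.write("\t".join(str(x) for x in row) + "\n")

# .txt: one string per line.
with open(txt_path, "w") as f:
    f.write("\n".join(filenames) + "\n")

# .pkl: arbitrary Python data structures (binary, Python-only).
with open(pkl_path, "wb") as f:
    pickle.dump(metadata, f)

# .json: the same structure in a format a browser can load.
with open(json_path, "w") as f:
    json.dump(metadata, f)

# Read everything back to check the round trip.
with open(tsv_path) as f:
    loaded_positions = [[float(x) for x in line.strip().split("\t")] for line in f]
with open(txt_path) as f:
    loaded_filenames = f.read().splitlines()
with open(pkl_path, "rb") as f:
    loaded_pickle = pickle.load(f)
with open(json_path) as f:
    loaded_json = json.load(f)
```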

Example Workflows

Audio spritesheet

  1. Collect Samples
  2. Samples to Audio Spritesheet
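
The idea behind an audio spritesheet can be sketched in a few lines: equal-length samples are laid end to end in one signal, so any sample can be recovered later from its index. This is only an illustration under the assumption that all samples share one length; the array sizes and the chop helper are hypothetical and not taken from the notebook.

```python
import numpy as np

# Hypothetical input: 4 samples, each already 6 audio frames long.
samples = np.arange(24, dtype=np.float32).reshape(4, 6)

# Build the spritesheet: all samples back to back in one 1-D signal.
spritesheet = samples.reshape(-1)

def chop(spritesheet, index, length):
    """Recover sample number `index` from a spritesheet of fixed-length samples."""
    start = index * length
    return spritesheet[start:start + length]

third = chop(spritesheet, index=2, length=samples.shape[1])
print(third)
```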

t-SNE embedding for samples

  1. Collect Samples
  2. Samples to Fingerprints
  3. Fingerprints to t-SNE (with mode = "fingerprints")

The standard workflow is to create a t-SNE embedding from fingerprints, but it's also possible to create an embedding after learning a classifier:

  1. Collect Samples
  2. Samples to Fingerprints
  3. Collect Metadata
  4. Metadata to Labels
  5. Fingerprints and Labels to Classifier
  6. Fingerprints to t-SNE (with mode = "combined")

t-SNE embedding for phonemes

Right now this only really works with extracting phonemes from transcribed speech, using Gentle.

  1. Gentle to Samples (with save_wav = True)
  2. Samples to Fingerprints
  3. Fingerprints to t-SNE

It's also possible to use Sphinx for speech that does not have transcriptions, but it can be significantly slower:

  1. Sphinx to Samples
  2. Collect Samples
  3. Samples to Fingerprints
  4. Fingerprints to t-SNE

t-SNE grid fingerprints spritesheet

By virtue of creating a rectangular grid, you may lose some points. This technique will only work on 10-20k points maximum.

  1. Collect Samples
  2. Samples to Fingerprints
  3. Fingerprints to t-SNE
  4. Run the example-data app from ofxAssignment or use CloudToGrid to convert a 2d t-SNE embedding to a grid embedding.
  5. Fingerprints to Spritesheet

If you only want a spritesheet without any sorting, skip step 4 and only run step 5 partially.
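
To see why a rectangular grid can lose points, consider the simplest case of a square grid: only side × side points fit, and the remainder has to be dropped. The sketch below assumes a square grid, which is an illustrative simplification; ofxAssignment and CloudToGrid may choose grid dimensions differently.

```python
import math

def grid_shape(n_points):
    """Largest square grid that fits in n_points, plus the number of leftover points.

    Hypothetical helper for illustration only.
    """
    side = math.isqrt(n_points)
    return side, side, n_points - side * side

for n in (10000, 12000):
    rows, cols, dropped = grid_shape(n)
    print(f"{n} points -> {rows}x{cols} grid, {dropped} dropped")
```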

Predict tags given tagged audio

  1. Collect Samples
  2. Samples to Fingerprints
  3. Collect Metadata
  4. Metadata to Labels
  5. Fingerprints and Labels to Classifier
