• Stars
    star
    369
  • Rank 114,994 (Top 3 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created over 5 years ago
  • Updated almost 3 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Pytorch port of Google Research's VGGish model used for extracting audio features.

Looking for maintainers - I no longer have the capacity to maintain this project. If you would like to take over maintenence, please get in touch. I will either forward to your fork, or add you as a maintainer for the project. Thanks.


VGGish

A torch-compatible port of VGGish[1], a feature embedding frontend for audio classification models. The weights are ported directly from the tensorflow model, so embeddings created using torchvggish will be identical.

Usage

import torch

model = torch.hub.load('harritaylor/torchvggish', 'vggish')
model.eval()

# Download an example audio file
import urllib
url, filename = ("http://soundbible.com/grab.php?id=1698&type=wav", "bus_chatter.wav")
try: urllib.URLopener().retrieve(url, filename)
except: urllib.request.urlretrieve(url, filename)

model.forward(filename)

[1] S. Hershey et al., ‘CNN Architectures for Large-Scale Audio Classification’,\ in International Conference on Acoustics, Speech and Signal Processing (ICASSP),2017\ Available: https://arxiv.org/abs/1609.09430, https://ai.google/research/pubs/pub45611