• Stars
    5,789
  • Rank 7,018 (Top 0.2 %)
  • Language
    Jupyter Notebook
  • License
    MIT License
  • Created almost 9 years ago
  • Updated 4 months ago

Repository Details

Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding

Neural speaker diarization with pyannote.audio

pyannote.audio is an open-source toolkit written in Python for speaker diarization. Based on the PyTorch machine learning framework, it provides a set of trainable end-to-end neural building blocks that can be combined and jointly optimized to build speaker diarization pipelines.

TL;DR

# 1. visit hf.co/pyannote/speaker-diarization and hf.co/pyannote/segmentation and accept user conditions (only if requested)
# 2. visit hf.co/settings/tokens to create an access token (only if you had to go through 1.)
# 3. instantiate pretrained speaker diarization pipeline
from pyannote.audio import Pipeline
pipeline = Pipeline.from_pretrained("pyannote/speaker-diarization",
                                    use_auth_token="ACCESS_TOKEN_GOES_HERE")

# 4. apply pretrained pipeline
diarization = pipeline("audio.wav")

# 5. print the result
for turn, _, speaker in diarization.itertracks(yield_label=True):
    print(f"start={turn.start:.1f}s stop={turn.end:.1f}s speaker_{speaker}")
# start=0.2s stop=1.5s speaker_0
# start=1.8s stop=3.9s speaker_1
# start=4.2s stop=5.7s speaker_0
# ...
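
A common next step is to summarize the printed turns per speaker. The following is a minimal sketch, not part of pyannote.audio: `speaking_time` is a hypothetical helper that works on plain `(start, end, speaker)` tuples, and the sample turns simply mirror the example output above. With the real pipeline, the same tuples could be collected from `turn.start`, `turn.end` and `speaker` inside the `itertracks` loop.

```python
from collections import defaultdict


def speaking_time(turns):
    """Sum the duration (in seconds) of each speaker's turns.

    `turns` is an iterable of (start, end, speaker) tuples.
    This helper is illustrative and not part of pyannote.audio.
    """
    totals = defaultdict(float)
    for start, end, speaker in turns:
        totals[speaker] += end - start
    return dict(totals)


# Hypothetical turns mirroring the sample output shown above.
turns = [(0.2, 1.5, "0"), (1.8, 3.9, "1"), (4.2, 5.7, "0")]
totals = speaking_time(turns)

for speaker, seconds in sorted(totals.items()):
    print(f"speaker_{speaker}: {seconds:.1f}s")
```

Overlapping turns from different speakers are counted independently here, so the per-speaker totals can add up to more than the audio duration.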

Highlights

Installation

Only Python 3.8+ is supported.

# install from develop branch
pip install -qq https://github.com/pyannote/pyannote-audio/archive/refs/heads/develop.zip

Documentation

Benchmark

Out of the box, the default pyannote.audio speaker diarization pipeline is expected to be much better (and faster) in v2.x than in v1.1. The numbers below are diarization error rates (in %):

Dataset \ Version        v1.1    v2.0    v2.1.1 (finetuned)
AISHELL-4                -       14.6    14.1 (14.5)
AliMeeting (channel 1)   -       -       27.4 (23.8)
AMI (IHM)                29.7    18.2    18.9 (18.5)
AMI (SDM)                -       29.0    27.1 (22.2)
CALLHOME (part2)         -       30.2    32.4 (29.3)
DIHARD 3 (full)          29.2    21.0    26.9 (21.9)
VoxConverse (v0.3)       21.5    12.6    11.2 (10.7)
REPERE (phase2)          -       12.6    8.2 (8.3)
This American Life       -       -       20.8 (15.2)
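
For readers unfamiliar with the metric, diarization error rate is conventionally the sum of false alarm, missed detection and speaker confusion durations, divided by the total duration of reference speech. The sketch below only illustrates that arithmetic: the function name and the durations are hypothetical, not taken from the benchmark, and it leaves out evaluation details (such as collars or how overlapped speech is scored) that this page does not specify.

```python
def diarization_error_rate(false_alarm, missed_detection, confusion, total):
    """Return DER as a fraction, given durations in seconds.

    `total` is the total duration of reference speech.
    Illustrative helper only; not an API of pyannote.audio.
    """
    if total <= 0:
        raise ValueError("total reference speech duration must be positive")
    return (false_alarm + missed_detection + confusion) / total


# Hypothetical durations, in seconds, for a 10-minute reference.
der = diarization_error_rate(
    false_alarm=12.0,
    missed_detection=18.0,
    confusion=30.0,
    total=600.0,
)
print(f"DER = {100 * der:.1f}%")
```

Because false alarms are counted in the numerator but not in the denominator, the rate can exceed 100% on difficult recordings.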

Citations

If you use pyannote.audio, please cite the following:

@inproceedings{Bredin2020,
  Title = {{pyannote.audio: neural building blocks for speaker diarization}},
  Author = {{Bredin}, Herv{\'e} and {Yin}, Ruiqing and {Coria}, Juan Manuel and {Gelly}, Gregory and {Korshunov}, Pavel and {Lavechin}, Marvin and {Fustes}, Diego and {Titeux}, Hadrien and {Bouaziz}, Wassim and {Gill}, Marie-Philippe},
  Booktitle = {ICASSP 2020, IEEE International Conference on Acoustics, Speech, and Signal Processing},
  Year = {2020},
}
@inproceedings{Bredin2021,
  Title = {{End-to-end speaker segmentation for overlap-aware resegmentation}},
  Author = {{Bredin}, Herv{\'e} and {Laurent}, Antoine},
  Booktitle = {Proc. Interspeech 2021},
  Year = {2021},
}

Support

For commercial enquiries and scientific consulting, please contact me.

Development

The commands below will set up pre-commit hooks and packages needed for developing the pyannote.audio library.

pip install -e .[dev,testing]
pre-commit install

Tests rely on a set of debugging files available in the tests/data directory. Set the PYANNOTE_DATABASE_CONFIG environment variable to tests/data/database.yml before running tests:

PYANNOTE_DATABASE_CONFIG=tests/data/database.yml pytest
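
The inline `VAR=value command` form above is POSIX shell syntax and does not work in every shell (for example, the Windows command prompt). As a portable alternative, here is a minimal sketch that sets the variable from Python before launching pytest. `build_test_env` and `run_tests` are hypothetical helper names, and the sketch assumes it is run from the repository root with pytest installed.

```python
import os
import subprocess
import sys


def build_test_env(config_path="tests/data/database.yml", base_env=None):
    """Return a copy of the environment with PYANNOTE_DATABASE_CONFIG set."""
    env = dict(os.environ if base_env is None else base_env)
    env["PYANNOTE_DATABASE_CONFIG"] = config_path
    return env


def run_tests():
    """Run pytest in a subprocess with the database config variable set."""
    return subprocess.call([sys.executable, "-m", "pytest"], env=build_test_env())
```

Calling `run_tests()` returns pytest's exit code, so it can be passed to `sys.exit` in a small wrapper script.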

More Repositories

1

pyannote-video

Face detection, tracking and clustering in videos
Python
420
star
2

pyannote-metrics

A toolkit for reproducible evaluation, diagnostic, and error analysis of speaker diarization systems
Python
177
star
3

pyannote-core

Advanced data structures for handling temporal segments with attached labels.
Jupyter Notebook
87
star
4

pyannote-database

Reproducible experimental protocols for multimedia (audio, video, text) database
Python
76
star
5

DEPRECATED-pyannote-audio-hub

[deprecated] Pretrained models for pyannote-audio 1.x
70
star
6

pyannote-db-voxceleb

VoxCeleb plugin for pyannote.database
Python
28
star
7

pyannote-pipeline

Tunable pipelines
Python
20
star
8

hf-speaker-diarization-3.1

Mirror of hf.co/pyannote/speaker-diarization-3.1
Python
6
star
9

pyannote-db-odessa-ami

AMI plugin for pyannote.database (as used in ODESSA project)
Python
6
star
10

DEPRECATED-pyannote-algorithms

[deprecated] Various algorithms used all over pyannote ecosystem
Python
6
star
11

DEPRECATED-pyannote-db-template

[deprecated] Template for creating your own pyannote.database plugin
Python
6
star
12

DEPRECATED-pyannote-features

[deprecated] Audio and textual features extraction.
Python
6
star
13

pyannote-data

5
star
14

DEPRECATED-pyannote-docker

[deprecated] Docker images for pyannote
3
star
15

DEPRECATED-pyannote-server

[deprecated] REST API for remote access to pyannote.
Python
2
star
16

DEPRECATED-pyannote-workflows

[deprecated] Various sciluigi tasks and workflows
Python
2
star
17

DEPRECATED-pyannote-parser

[deprecated] File parsers to load data as pyannote.core structures.
Python
2
star
18

pyannote-db-thebigbangtheory

"The Big Bang Theory" plugin for pyannote.database
Python
2
star
19

DEPRECATED-pyannote.js

[deprecated]
JavaScript
2
star
20

pyannote-db-musan

MUSAN plugin for pyannote.database
Python
1
star
21

hf-pretrained-pipelines

Pretrained pipelines demo
Python
1
star
22

pyannote-audio-demo-DEPRECATED

Python
1
star
23

DEPRECATED-pyannote-generators

[deprecated] itertools for pyannote
Python
1
star