• Stars
    star
    125
  • Rank 286,335 (Top 6 %)
  • Language
    Python
  • License
    GNU General Publi...
  • Created over 3 years ago
  • Updated 6 months ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Simplified diarization pipeline using some pretrained models - audio file to diarized segments in a few lines of code

simple_diarizer

Open In Colab

Simplified diarization pipeline using some pretrained models.

Made to be a simple as possible to go from an input audio file to diarized segments.

import soundfile as sf
import matplotlib.pyplot as plt

from simple_diarizer.diarizer import Diarizer
from simple_diarizer.utils import combined_waveplot

diar = Diarizer(
                  embed_model='xvec', # 'xvec' and 'ecapa' supported
                  cluster_method='sc' # 'ahc' and 'sc' supported
               )

segments = diar.diarize(WAV_FILE, num_speakers=NUM_SPEAKERS)

signal, fs = sf.read(WAV_FILE)
combined_waveplot(signal, fs, segments)
plt.show()

Install

Simplified diarization is available on PyPI:

pip install simple-diarizer

Source Video

"Some Quick Advice from Barack Obama!"

YouTube Thumbnail

Pre-trained Models

The following pretrained models are used:

Demo

Open In Colab

It can be checked out in the above link, where it will try and diarize any input YouTube URL.

Other References

Planned Features