  • Language
    Python
  • License
    MIT License

Repository Details

A Whisper command-line client, based on CTranslate2, that is compatible with the original OpenAI client.

Introduction

whisper-ctranslate2 is a Whisper command-line client, based on CTranslate2, that is compatible with the original OpenAI client. It uses CTranslate2 and the faster-whisper implementation of Whisper, which is up to 4 times faster than openai/whisper at the same accuracy while using less memory.

Goals of the project:

  • Provide an easy way to use the CTranslate2 Whisper implementation
  • Ease the migration for people using OpenAI Whisper CLI

Installation

To install the latest stable version, just type:

pip install -U whisper-ctranslate2

Alternatively, if you are interested in the latest development (non-stable) version from this repository, just type:

pip install git+https://github.com/jordimas/whisper-ctranslate2.git

CPU and GPU support

GPU and CPU support are provided by CTranslate2.

CTranslate2 supports x86-64 and AArch64/ARM64 CPUs and integrates multiple backends that are optimized for these platforms: Intel MKL, oneDNN, OpenBLAS, Ruy, and Apple Accelerate.

GPU execution requires the NVIDIA libraries cuBLAS 11.x and cuDNN 8.x to be installed on the system. Please refer to the CTranslate2 documentation for details.

By default, the best available hardware is selected for inference. You can use the --device and --device_index options to control the selection manually.
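When the client is driven from a script, the device options can be passed explicitly. The following is a minimal sketch that only uses the two flags named above; the helper name build_command, the file name audio.mp3, and the device value cuda are assumptions chosen for illustration, so check whisper-ctranslate2 --help for the values your version accepts.

```python
import shlex


def build_command(audio_path, device=None, device_index=None):
    """Build an argument list for whisper-ctranslate2 (hypothetical helper)."""
    cmd = ["whisper-ctranslate2", audio_path]
    if device is not None:
        # Omitting --device leaves hardware selection to the client.
        cmd += ["--device", device]
    if device_index is not None:
        cmd += ["--device_index", str(device_index)]
    return cmd


# "cuda" and index 0 are assumed example values, not taken from the README.
cmd = build_command("audio.mp3", device="cuda", device_index=0)
print(shlex.join(cmd))
# To actually run the client: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems with file names that contain spaces.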

Usage

The command line is the same as OpenAI Whisper's.

To transcribe:

whisper-ctranslate2 inaguracio2011.mp3 --model medium

To translate:

whisper-ctranslate2 inaguracio2011.mp3 --model medium --task translate

The Whisper translate task translates the transcription from the source language into English (the only supported target language).

To list all the supported options with their help text, run:

whisper-ctranslate2 --help

CTranslate2 specific options

On top of the OpenAI Whisper command line options, there are some specific options provided by CTranslate2 or whisper-ctranslate2.

Quantization

The --compute_type option selects the type of quantization to use. It accepts the values default, auto, int8, int8_float16, int16, float16, and float32. On CPU, int8 gives the best performance:

whisper-ctranslate2 myfile.mp3 --compute_type int8
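To give an intuition for what an int8 compute type trades away, here is a generic sketch of symmetric 8-bit quantization in plain Python. It is an illustration of the general idea only, not CTranslate2's actual implementation; the weight values and function names are made up for the example.

```python
def quantize_int8(weights):
    """Map floats to integers in [-127, 127] with a single shared scale."""
    scale = max(abs(w) for w in weights) / 127
    quantized = [round(w / scale) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate floats from the stored integers."""
    return [q * scale for q in quantized]


weights = [0.5, -1.27, 0.003, 1.0]  # made-up example values
quantized, scale = quantize_int8(weights)
restored = dequantize(quantized, scale)

# Each restored value differs from the original by at most half a step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(quantized, max_error)
```

Each weight is stored in one byte instead of four, at the cost of a rounding error of at most half a quantization step. Small values such as 0.003 collapse to zero, which is why accuracy should be checked after changing the compute type.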

Loading the model from a directory

The --model_directory option specifies the directory from which to load a CTranslate2 Whisper model, for example your own quantized Whisper model or your own fine-tuned Whisper version. The model must be in CTranslate2 format.
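A script can check that a directory looks like a converted model before handing it to the client. The sketch below rests on an assumption: that the converted CTranslate2 model stores its weights in a file named model.bin, which is what CTranslate2 converters typically produce. The helper name and the directory name in the usage comment are hypothetical.

```python
from pathlib import Path


def model_directory_args(directory):
    """Return --model_directory arguments if the directory looks usable.

    Assumption: a converted CTranslate2 model keeps its weights in a file
    named model.bin; adjust the check if your converter output differs.
    """
    path = Path(directory)
    if not (path / "model.bin").is_file():
        raise FileNotFoundError(f"no model.bin found in {path}")
    return ["--model_directory", str(path)]


# Usage (hypothetical directory name):
# cmd = ["whisper-ctranslate2", "myfile.mp3"] + model_directory_args("my-whisper-ct2")
```

Failing early with a clear message is usually easier to diagnose than an error raised later during model loading.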

Using Voice Activity Detection (VAD) filter

The --vad_filter option enables voice activity detection (VAD) to filter out parts of the audio without speech. This step uses the Silero VAD model:

whisper-ctranslate2 myfile.mp3 --vad_filter True

The VAD filter accepts multiple additional options to determine the filter behavior:

--vad_threshold VALUE (float)

Probabilities above this value are considered as speech.

--vad_min_speech_duration_ms VALUE (int)

Final speech chunks shorter than min_speech_duration_ms are thrown out.

--vad_max_speech_duration_s VALUE (int)

Maximum duration of speech chunks in seconds. Longer chunks are split at the timestamp of the last silence.
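The effect of the threshold and the minimum speech duration can be illustrated with a simplified sketch over per-frame speech probabilities. This is not the Silero VAD algorithm: the frame length, the probabilities, and the default values are assumptions for illustration, and the splitting of overly long chunks is left out.

```python
def speech_chunks(probs, frame_ms, threshold=0.5, min_speech_duration_ms=250):
    """Return (start_ms, end_ms) spans of speech, dropping spans that are too short."""
    chunks = []
    start = None
    for i, p in enumerate(probs):
        if p > threshold and start is None:
            start = i  # a speech span begins at this frame
        elif p <= threshold and start is not None:
            chunks.append((start * frame_ms, i * frame_ms))
            start = None
    if start is not None:
        # The audio ended while still inside a speech span.
        chunks.append((start * frame_ms, len(probs) * frame_ms))
    return [(s, e) for s, e in chunks if e - s >= min_speech_duration_ms]


# Made-up probabilities, one per 100 ms frame.
probs = [0.1, 0.9, 0.8, 0.2, 0.7, 0.1, 0.9, 0.9, 0.9, 0.9]
print(speech_chunks(probs, frame_ms=100))
# → [(600, 1000)]
```

In this example the spans at 100-300 ms and 400-500 ms pass the threshold but are shorter than 250 ms, so only the last span survives. Raising the threshold or the minimum duration makes the filter more aggressive.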

Print colors

The --print_colors True option prints the transcribed text using an experimental color coding strategy, based on whisper.cpp, that highlights words with high or low confidence:

whisper-ctranslate2 myfile.mp3 --print_colors True

Live transcribe from your microphone

The --live_transcribe True option activates live transcription from your microphone:

whisper-ctranslate2 --live_transcribe True --language en

Need help?

See our frequently asked questions for answers to common questions.

Contact

Jordi Mas [email protected]
