• Stars: 299
• Rank: 139,269 (Top 3%)
• Language: Python
• Created over 4 years ago
• Updated 10 months ago

Repository Details

Speech Recognition for Ukrainian 🇺🇦

The goal of this repository is to collect information and datasets for Ukrainian automatic speech recognition (ASR), also known as speech-to-text.

It also contains information about Ukrainian speech synthesis, also known as text-to-speech.

You can also start a discussion.

Donate

You can support our work with a donation.

🎤 Speech-to-Text

💡 Implementations

wav2vec2

You can check demos out here: https://github.com/egorsmkv/wav2vec2-uk-demo

data2vec

Citrinet

ContextNet

FastConformer

Squeezeformer

Silero

VOSK

Note: VOSK models are licensed under the Apache License 2.0.

DeepSpeech

M-CTC-T

whisper

Flashlight

📊 Benchmarks

These benchmarks use the Common Voice 10 test split.
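The tables in this section report word error rate (WER), character error rate (CER), and accuracy. In every row, the accuracy column equals (1 − WER) × 100. The following is a minimal sketch of how such metrics can be computed from a reference and a hypothesis transcript using Levenshtein distance. It is not the repository's evaluation script; the function names are illustrative, and a real benchmark would normally sum errors over the whole test split rather than score one sentence.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: character-level edit distance divided by reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


def accuracy_percent(wer_value: float) -> float:
    """Accuracy as reported in the tables: (1 - WER) * 100, rounded to 2 places."""
    return round((1 - wer_value) * 100, 2)


if __name__ == "__main__":
    reference = "місто в Хмельницькій області"
    hypothesis = "місто в Хмельницький області"
    print(wer(reference, hypothesis), cer(reference, hypothesis))
    print(accuracy_percent(0.1807))
```

A WER of 0.1807, for example, corresponds to the 81.93% accuracy listed for the first wav2vec2 model.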

wav2vec2

| Model | WER | CER | Accuracy, % | WER+LM | CER+LM | Accuracy+LM, % |
|---|---|---|---|---|---|---|
| Yehor/wav2vec2-xls-r-1b-uk-with-lm | 0.1807 | 0.0317 | 81.93% | 0.1193 | 0.0218 | 88.07% |
| Yehor/wav2vec2-xls-r-1b-uk-with-binary-news-lm | 0.1807 | 0.0317 | 81.93% | 0.0997 | 0.0191 | 90.03% |
| Yehor/wav2vec2-xls-r-300m-uk-with-lm | 0.2906 | 0.0548 | 70.94% | 0.172 | 0.0355 | 82.8% |
| Yehor/wav2vec2-xls-r-300m-uk-with-news-lm | 0.2027 | 0.0365 | 79.73% | 0.0929 | 0.019 | 90.71% |
| Yehor/wav2vec2-xls-r-300m-uk-with-wiki-lm | 0.2027 | 0.0365 | 79.73% | 0.1045 | 0.0208 | 89.55% |
| Yehor/wav2vec2-xls-r-base-uk-with-small-lm | 0.4441 | 0.0975 | 55.59% | 0.2878 | 0.0711 | 71.22% |
| robinhad/wav2vec2-xls-r-300m-uk | 0.2736 | 0.0537 | 72.64% | - | - | - |
| arampacha/wav2vec2-xls-r-1b-uk | 0.1652 | 0.0293 | 83.48% | 0.0945 | 0.0175 | 90.55% |

Citrinet

lm-4gram-500k is used as the LM

| Model | WER | CER | Accuracy, % | WER+LM | CER+LM | Accuracy+LM, % |
|---|---|---|---|---|---|---|
| nvidia/stt_uk_citrinet_1024_gamma_0_25 | 0.0432 | 0.0094 | 95.68% | 0.0352 | 0.0079 | 96.48% |
| neongeckocom/stt_uk_citrinet_512_gamma_0_25 | 0.0746 | 0.016 | 92.54% | 0.0563 | 0.0128 | 94.37% |

ContextNet

| Model | WER | CER | Accuracy, % |
|---|---|---|---|
| theodotus/stt_uk_contextnet_512 | 0.0669 | 0.0145 | 93.31% |

FastConformer P&C

This model outputs text with punctuation and capitalization (P&C).

| Model | WER | CER | Accuracy, % | WER+P&C | CER+P&C | Accuracy+P&C, % |
|---|---|---|---|---|---|---|
| theodotus/stt_ua_fastconformer_hybrid_large_pc | 0.0400 | 0.0102 | 96.00% | 0.0710 | 0.0167 | 92.90% |
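The source does not say how the plain and +P&C columns were scored. One common setup, assumed here purely for illustration, is to score the plain columns on text that has been lowercased and stripped of punctuation, and the +P&C columns on the raw text. A minimal sketch of such a normalization step follows; the `normalize` helper and the punctuation set are hypothetical and not taken from the repository.

```python
# Characters stripped from the edges of each token. This set is an
# illustrative assumption, not the benchmark's actual configuration.
EDGE_PUNCTUATION = ".,!?;:\"()«»…-\u2013\u2014"


def normalize(text: str) -> str:
    """Lowercase and drop punctuation, keeping apostrophes and hyphens inside words."""
    tokens = (token.strip(EDGE_PUNCTUATION) for token in text.lower().split())
    # Tokens that consisted only of punctuation (e.g. a standalone dash) become empty.
    return " ".join(token for token in tokens if token)


if __name__ == "__main__":
    print(normalize("Кам'янець-Подільський - місто в Хмельницькій області України."))
```

Only token edges are stripped so that the apostrophe in "Кам'янець" and the hyphen in "Кам'янець-Подільський" survive, since both are part of the spelling in Ukrainian.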

Squeezeformer

lm-4gram-500k is used as the LM

| Model | WER | CER | Accuracy, % | WER+LM | CER+LM | Accuracy+LM, % |
|---|---|---|---|---|---|---|
| theodotus/stt_uk_squeezeformer_ctc_xs | 0.1078 | 0.0229 | 89.22% | 0.0777 | 0.0174 | 92.23% |
| theodotus/stt_uk_squeezeformer_ctc_sm | 0.082 | 0.0175 | 91.8% | 0.0605 | 0.0142 | 93.95% |
| theodotus/stt_uk_squeezeformer_ctc_ml | 0.0591 | 0.0126 | 94.09% | 0.0451 | 0.0105 | 95.49% |

Flashlight

lm-4gram-500k is used as the LM

| Model | WER | CER | Accuracy, % | WER+LM | CER+LM | Accuracy+LM, % |
|---|---|---|---|---|---|---|
| Flashlight Conformer | 0.1915 | 0.0244 | 80.85% | 0.0907 | 0.0198 | 90.93% |

data2vec

| Model | WER | CER | Accuracy, % |
|---|---|---|---|
| robinhad/data2vec-large-uk | 0.3117 | 0.0731 | 68.83% |

VOSK

| Model | WER | CER | Accuracy, % |
|---|---|---|---|
| v3 | 0.5325 | 0.3878 | 46.75% |

Silero

| Model | WER | CER | Accuracy, % |
|---|---|---|---|
| snakers4/silero-models | 0.2356 | 0.0646 | 76.44% |

M-CTC-T

| Model | WER | CER | Accuracy, % |
|---|---|---|---|
| speechbrain/m-ctc-t-large | 0.57 | 0.1094 | 43% |

whisper

| Model | WER | CER | Accuracy, % |
|---|---|---|---|
| tiny | 0.6308 | 0.1859 | 36.92% |
| base | 0.521 | 0.1408 | 47.9% |
| small | 0.3057 | 0.0764 | 69.43% |
| medium | 0.1873 | 0.044 | 81.27% |
| large (v1) | 0.1642 | 0.0393 | 83.58% |
| large (v2) | 0.1372 | 0.0318 | 86.28% |

Fine-tuned version for Ukrainian:

| Model | WER | CER | Accuracy, % |
|---|---|---|---|
| small | 0.2704 | 0.0565 | 72.96% |
| large | 0.2482 | 0.055 | 75.18% |

To fine-tune a Whisper model on your own data, use this repository: https://github.com/egorsmkv/whisper-ukrainian

DeepSpeech

| Model | WER | CER | Accuracy, % |
|---|---|---|---|
| v0.5 | 0.7025 | 0.2009 | 29.75% |

📖 Development

📚 Datasets

A dataset compiled from open sources, companies, and the community: 188.31 GB / ~1200 hours 💪

Voice of America (398 hours)

Companies

Cleaned Common Voice 10 (test set)

Noised Common Voice 10

Community

Other

Related works

Language models

Inverse Text Normalization:

Text Enhancement

📢 Text-to-Speech

Test sentence with stresses:

К+ам'ян+ець-Под+ільський - м+істо в Хмельн+ицькій +області Укра+їни, ц+ентр Кам'ян+ець-Под+ільської міськ+ої об'+єднаної територі+альної гром+ади +і Кам'ян+ець-Под+ільського рай+ону.

Without stresses:

Кам'янець-Подільський - місто в Хмельницькій області України, центр Кам'янець-Подільської міської об'єднаної територіальної громади і Кам'янець-Подільського району.
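In the test sentence above, a `+` is placed immediately before each stressed vowel, and the second version is the same text with the marks removed. The sketch below shows two small helpers for working with this notation: one strips the marks, the other converts them to a combining acute accent (U+0301) after the stressed vowel. The function names are illustrative, and whether a given TTS model expects `+` marks, acute accents, or plain text is an assumption to check against that model's documentation.

```python
STRESS_MARK = "+"
COMBINING_ACUTE = "\u0301"


def strip_stress(text: str) -> str:
    """Remove '+' stress marks, yielding the plain sentence."""
    return text.replace(STRESS_MARK, "")


def plus_to_acute(text: str) -> str:
    """Replace each '+' with a combining acute accent placed after the next character."""
    out = []
    stressed = False
    for ch in text:
        if ch == STRESS_MARK:
            stressed = True  # the next character is the stressed vowel
            continue
        out.append(ch)
        if stressed:
            out.append(COMBINING_ACUTE)
            stressed = False
    return "".join(out)


if __name__ == "__main__":
    marked = "м+істо в Хмельн+ицькій +області Укра+їни"
    print(strip_stress(marked))
    print(plus_to_acute(marked))
```

Applying `strip_stress` to the stressed test sentence reproduces the unstressed one exactly.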

💡 Implementations

RAD-TTS

Silero TTS

Coqui TTS

Neon TTS

📚 Datasets

Related works

Accentors

More Repositories

1. simple-django-login-and-register (Python, 817 stars): An example Django project with basic user functionality.
2. openapi3-generator (Python, 97 stars): A generator for OpenAPI 3
3. asr-corpus-creator (Python, 27 stars): This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.
4. ukrainian-accentor (Jupyter Notebook, 15 stars): Add accents to words in the Ukrainian language
5. ua-silero-demo (Python, 15 stars): Speech-to-Text for the Ukrainian language based on Silero
6. NLLB-Translator (Python, 15 stars)
7. whisper-ukrainian (Python, 15 stars): Trainer and Evaluation scripts for fine-tuning Whisper models for the Ukrainian language
8. libra-grpc-py (Python, 13 stars): gRPC client for Libra in Python
9. qirimtatar-tts-datasets (Python, 13 stars): Open Source Crimean Tatar Text-to-Speech datasets
10. audio-katana (Python, 11 stars): A tool to slice your audio files into chunks using the Voice Activity Detection technique
11. diagrams-telephony (Python, 10 stars): Telephony classes for Python's diagrams package
12. radtts-hifigan (Python, 10 stars): RADTTS + HiFiGAN vocoder
13. fail2ban-scripts (Python, 9 stars): Simple Python code to get the list of banned IP addresses from Fail2ban
14. django-encore (Python, 8 stars): Django integration with Webpack Encore
15. ukrainian-stt-bot (Python, 7 stars): Telegram bot for Ukrainian Speech-to-Text
16. flashlight-ukrainian (7 stars): The Ukrainian Acoustic Model for Flashlight
17. wav2vec2-uk-demo (Python, 7 stars): Demo of Ukrainian wav2vec2 model
18. feedly_search (Go, 5 stars): Utility for searching RSS feeds on https://feedly.com
19. laravel-boilerplate (PHP, 5 stars): A Laravel Boilerplate with Batteries on Modern Technologies
20. radtts-uk (4 stars): 🇺🇦 Ukrainian RAD-TTS++ models (decoder + models with 3 voices) and HiFiGAN model
21. wmsigner (Python, 4 stars): WebMoney Signer for Python
22. asr-tg-bot-corpus (4 stars): An ASR Corpus created using a Telegram bot for Ukrainian
23. share-image-server (Go, 4 stars): A generation server for preview images
24. useful-stuff (Jinja, 4 stars): Useful scripts, configs, Docker files, and commands
25. tts-silero-bot (Python, 4 stars): This Telegram bot shows Silero TTS
26. voicefilter (Python, 3 stars): Inference part of VoiceFilter
27. ukrainian-onnx-model (Python, 3 stars): An ONNX model for speech recognition of the Ukrainian language
28. vosk-ukrainian-demo (Python, 3 stars): VOSK demo for the Ukrainian language
29. wav2vec2-jit (Python, 3 stars): Export a traced JIT version of wav2vec2 models
30. oberon-cgi (Go, 3 stars): The answer to a question: "How to use Oberon in the Web?"
31. lang-detector-bot (Python, 3 stars): Telegram bot for language detection from Voice based on Silero VAD
32. privatbank-send-money (Ruby, 3 stars): A simple script for sending money in PrivatBank via the Merchant API
33. radtts-istftnet (Python, 3 stars): RADTTS + iSTFTNet vocoder
34. test-wav2vec2-by-microphone (Python, 3 stars): A small script to test wav2vec2 models using a microphone
35. kvm-over-ip-cn8000a-jnlp-client (Dockerfile, 3 stars): Instructions for running the Java client from ATEN's "KVM over IP" software
36. xeus-finetune (Python, 2 stars): XEUS training code
37. short-urls-service (JavaScript, 2 stars): This is a pet project where I used the Alpas web framework. Nothing big, just a service to shorten links.
38. spam-kill-robot (Go, 2 stars): Telegram bot for deleting spam messages.
39. mysql-faq (2 stars): Frequently asked questions about MySQL
40. vosk-ukrainian-stt-bot (Python, 2 stars): Telegram Bot: Speech-to-Text for the Ukrainian language based on VOSK
41. ukrainian-radtts (2 stars): Text-to-Speech for Ukrainian using NVIDIA's RADTTS
42. iSTFTNet-pytorch (Python, 2 stars): Patched original code with some developer additions, don't use in prod
43. AdminLTE2-All-in-One (JavaScript, 2 stars): This is just AdminLTE (the second version) with the bower_components folder that was cleaned (only dist files).
44. do-func-golang-example (Go, 2 stars): DigitalOcean Function written in Go with DevContainer and Tests
45. river-demo (Go, 2 stars): Some code to test https://github.com/riverqueue/river
46. feed-supermaster (Go, 2 stars): https://github.com/egorsmkv/feed-master with some changes
47. export-facebook-insights-to-bigquery (Python, 2 stars): This project lets you export data from Facebook/Meta Business to BigQuery.
48. taskiq-pgsql-rabbitmq (Python, 1 star): Taskiq: PostgreSQL (store results) and RabbitMQ (broker)
49. gpu-state-tgbot (Go, 1 star): This little bot shows info about your GPUs
50. fsmn-vad-demo (Python, 1 star)
51. vocos (Python, 1 star)
52. decimal (PHP, 1 star): A fork of https://github.com/rtlopez/decimal
53. ikonboard (Perl, 1 star): For future internet archaeologists.
54. ofront-omake (Shell, 1 star): This repository shows how to use Ofront+ and OMake together to build Oberon/Component Pascal programs
55. microphone-recorder (Python, 1 star): Record WAV files using your microphone with Python
56. ukwiki-kenlm (Python, 1 star): Instructions for building a KenLM model based on Ukrainian Wikipedia data
57. flair-nlp-uk (Python, 1 star): Ukrainian NLP using Flair models
58. privat24-merchant-demo (PHP, 1 star): A demonstration of purchasing goods on a website via Privat24 Merchant
59. STAR-Adapt-uk (Python, 1 star): Fork of https://github.com/YUCHEN005/STAR-Adapt with some modifications for Ukrainian.
60. simple-django-login-and-register-dynamic-lang (Python, 1 star): An example Django project with basic user functionality and dynamic language
61. ukrainian-flowtron-tts (1 star): Text-to-Speech for Ukrainian using NVIDIA's Flowtron
62. CleanUNet (Python, 1 star): Packaged CleanUNet
63. qt-multithreading (C++, 1 star): Implementation of multithreading in Qt.
64. otp-qr-demo (PHP, 1 star): A demonstration of how to create your own OTP authentication
65. tg-extract-history (Python, 1 star): Download messages from Telegram groups
66. vosk-websocket-server (Python, 1 star): Example code for running a VOSK WebSocket server with the Ukrainian model
67. asr-corpus-by-microphone (Python, 1 star): This is a simple solution for people who want to create their own corpus for Automatic Speech Recognition with just a microphone