• This repository has been archived on 20/Jun/2022
  • Stars: 116
  • Rank: 303,894 (Top 6 %)
  • Language: Python
  • License: GNU General Publi...
  • Created almost 5 years ago
  • Updated over 3 years ago

Repository Details

Normalization

Russian STT Text Normalization

A Russian text normalization pipeline for speech-to-text and other applications, based on tagging sequence-to-sequence (s2s) networks.

Requirements

  • Python >= 3.6
  • PyTorch >= 1.4 for s2s pipeline
  • tqdm for progress bar
pip install torch
pip install tqdm

Usage

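from normalizer import Normalizer

# Sample input with a date, an abbreviation and a decimal number with a unit
# (English: "From 12.01.1943 the area of the village council is 1785.5 ha.")
text = 'С 12.01.1943 г. площадь сельсовета — 1785,5 га.'

norm = Normalizer()
result = norm.norm_text(text)
print(result)
# Output:
# С двенадцатого января тысяча девятьсот сорок третьего года площадь сельсовета
# — тысяча семьсот восемьдесят пять целых и пять десятых гектара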
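
When many lines need normalizing, the single call above can be wrapped in a small loop. The following is only a sketch: the helper name `normalize_lines` is hypothetical and not part of this repository. It accepts any callable that maps a string to a string, so `Normalizer().norm_text` from the usage example can be passed in. It uses `tqdm` for the progress bar and falls back to a plain loop if `tqdm` is not installed.

```python
from typing import Callable, Iterable, List

try:
    from tqdm import tqdm  # progress bar, listed under Requirements
except ImportError:
    # Fallback so the sketch still runs without tqdm: no progress bar.
    def tqdm(iterable, **kwargs):
        return iterable


def normalize_lines(lines: Iterable[str],
                    norm_fn: Callable[[str], str]) -> List[str]:
    """Apply norm_fn to every non-empty line and collect the results.

    normalize_lines is a hypothetical helper, not part of the repository.
    norm_fn is any str -> str callable, e.g. Normalizer().norm_text.
    """
    results = []
    # Materialize the input so tqdm can show a total count.
    for line in tqdm(list(lines), desc="normalizing"):
        line = line.strip()
        if line:  # skip blank lines
            results.append(norm_fn(line))
    return results


# Assumed usage with the class from the Usage section:
#   from normalizer import Normalizer
#   normalized = normalize_lines(open("input.txt", encoding="utf-8"),
#                                Normalizer().norm_text)
```

Passing the normalizer in as a callable keeps the loop independent of how the model is constructed, and lets the helper be tried out with a simple stand-in function before the model is loaded.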