Repository Details

Bangla-deep-speech-Recognition

Bangla deep speech recognition is a Bangla speech-to-text transcription system based on a deep bidirectional RNN. The main focus of the project is to support industrial applications, such as searching for a product by voice command, using an end-to-end Bangla speech recognition model. It aims for an easy-to-use, efficient, small, and scalable implementation that covers training, inference, and testing modules, as well as deployment.

Dataset

The voice data comes in two parts:
1) A dataset I collected myself, with a voice corpus generated from the company's products. I used a small corpus of 40-50 audio files. Adding more recordings should improve results and reduce overfitting.
2) A Bengali ASR training dataset containing ~196K utterances. Dataset link: http://openslr.org/53/
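
As a rough illustration of how transcripts for the larger dataset might be loaded, the sketch below parses a tab-separated index into a mapping from utterance ID to text. The column layout (utterance ID, speaker ID, transcript), the helper name `read_transcripts`, and the sample rows are assumptions made for illustration and are not taken from this repository; check the files shipped with the download before relying on them.

```python
import csv
import io
from typing import Dict, TextIO


def read_transcripts(handle: TextIO) -> Dict[str, str]:
    """Map utterance ID -> transcript from a tab-separated index.

    Assumed (hypothetical) layout: utterance_id <TAB> speaker_id <TAB> text.
    """
    transcripts: Dict[str, str] = {}
    # QUOTE_NONE keeps any quote characters in the transcript as plain text.
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 3:
            continue  # skip blank or malformed lines
        utt_id, _speaker_id, text = row[0], row[1], row[2]
        transcripts[utt_id] = text.strip()
    return transcripts


# Tiny in-memory example standing in for a real index file.
sample = io.StringIO(
    "utt001\tspk01\tআমি ভাত খাই\n"
    "utt002\tspk02\tবই কিনতে চাই\n"
)
transcripts = read_transcripts(sample)
```

With a real file, the same function could be called on `open(path, encoding="utf-8", newline="")` instead of the in-memory sample.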

Annotation Tools

1) https://online-audio-converter.com/
2) https://twistedwave.com/online

Model

1) RNN model, LSTM model, bidirectional RNN model, deep model
2) An RNN-Transducer model is in progress
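
To make the bidirectional idea concrete, here is a minimal plain-Python sketch of a bidirectional RNN forward pass. It is not the model from the notebook: the tanh cell, the random weights, the layer sizes, and the helper names `rnn_pass`, `bidirectional_rnn`, and `random_matrix` are all assumptions chosen for illustration. It only shows that each output frame combines a left-to-right state with a right-to-left state.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def rnn_pass(inputs: List[Vector], w_in: Matrix, w_rec: Matrix) -> List[Vector]:
    """Run a simple tanh RNN over `inputs`; return the hidden state at each step."""
    hidden_size = len(w_rec)
    h = [0.0] * hidden_size
    states = []
    for x in inputs:
        # New state depends on the current frame and the previous state.
        h = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * h[k] for k in range(hidden_size))
            )
            for j in range(hidden_size)
        ]
        states.append(h)
    return states


def bidirectional_rnn(
    inputs: List[Vector],
    fwd: Tuple[Matrix, Matrix],
    bwd: Tuple[Matrix, Matrix],
) -> List[Vector]:
    """Concatenate forward and backward hidden states for every frame."""
    forward = rnn_pass(inputs, *fwd)
    # Process the reversed sequence, then flip back to align with the frames.
    backward = rnn_pass(inputs[::-1], *bwd)[::-1]
    return [f + b for f, b in zip(forward, backward)]


def random_matrix(rows: int, cols: int, rng: random.Random) -> Matrix:
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


# Hypothetical sizes: 5 frames of 3 features, 4 hidden units per direction.
rng = random.Random(0)
n_features, n_hidden, n_frames = 3, 4, 5
fwd = (random_matrix(n_hidden, n_features, rng), random_matrix(n_hidden, n_hidden, rng))
bwd = (random_matrix(n_hidden, n_features, rng), random_matrix(n_hidden, n_hidden, rng))
frames = [[rng.uniform(-1.0, 1.0) for _ in range(n_features)] for _ in range(n_frames)]
outputs = bidirectional_rnn(frames, fwd, bwd)
```

Each entry of `outputs` has `2 * n_hidden` values: the first half summarizes the frames up to that point, and the second half summarizes the frames from that point to the end. A trained system would learn the weights and typically use LSTM cells from a framework such as TensorFlow rather than this hand-written loop.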

Dependency

Python 3.7
TensorFlow 2.0.0

Project Structure:

Run the following notebook:
speech_recognition (2).ipynb

Results:

[Image: Capture-1]

References