• Stars
    4,073
  • Rank 10,561 (Top 0.3 %)
  • Language
    Python
  • License
    Apache License 2.0
  • Created almost 4 years ago
  • Updated 19 days ago

Repository Details

Production First and Production Ready End-to-End Speech Recognition Toolkit

WeNet

Roadmap | Docs | Papers | Runtime | Pretrained Models | HuggingFace

We share Net together.

Highlights

  • Production first and production ready: WeNet's core design principle is to provide full-stack production solutions for speech recognition.
  • Accurate: WeNet achieves SOTA results on many public speech datasets.
  • Lightweight: WeNet is easy to install, easy to use, well designed, and well documented.

Install

Please refer to the documentation for installation instructions.

Discussion & Communication

You can discuss directly on GitHub Issues.

Chinese users can also scan the QR code on the left to follow the official WeNet account. We created a WeChat group for better discussion and quicker responses. Please scan the personal QR code on the right; that person is responsible for inviting you to the chat group.

Acknowledgements

  1. We borrowed a lot of code from ESPnet for Transformer-based modeling.
  2. We borrowed a lot of code from Kaldi for WFST-based decoding for LM integration.
  3. We referred to EESEN for building the TLG-based graph for LM integration.
  4. We referred to OpenTransformer for Python batch inference of E2E models.

Citations

@inproceedings{yao2021wenet,
  title={WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit},
  author={Yao, Zhuoyuan and Wu, Di and Wang, Xiong and Zhang, Binbin and Yu, Fan and Yang, Chao and Peng, Zhendong and Chen, Xiaoyu and Xie, Lei and Lei, Xin},
  booktitle={Proc. Interspeech},
  year={2021},
  address={Brno, Czech Republic},
  organization={IEEE}
}

@article{zhang2022wenet,
  title={WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit},
  author={Zhang, Binbin and Wu, Di and Peng, Zhendong and Song, Xingchen and Yao, Zhuoyuan and Lv, Hang and Xie, Lei and Yang, Chao and Pan, Fuping and Niu, Jianwei},
  journal={arXiv preprint arXiv:2203.15455},
  year={2022}
}

More Repositories

  1. speech-synthesis-paper (987 stars): List of speech synthesis papers.
  2. wespeaker (Python, 657 stars): Research and Production Oriented Speaker Verification, Recognition and Diarization Toolkit
  3. WenetSpeech (Shell, 488 stars): A 10000+ hours dataset for Chinese speech recognition
  4. WeTextProcessing (Python, 443 stars): Text Normalization & Inverse Text Normalization
  5. wekws (Python, 430 stars): Production First and Production Ready End-to-End Keyword Spotting Toolkit
  6. wetts (Python, 367 stars): Production First and Production Ready End-to-End Text-to-Speech Toolkit
  7. speech-recognition-papers (325 stars): Towards hot directions in industrial end-to-end speech recognition
  8. opencpop (207 stars): Opencpop: A High-Quality Open Source Chinese Popular Song Database for Singing Voice Synthesis
  9. wenet-kws (Python, 142 stars): Production First and Production Ready End-to-End Keyword Spotting Toolkit
  10. west (Python, 109 stars): We Speech Transcript based on LLM, in 300 lines of code.
  11. wesignal (63 stars): Production first, NN-based on-device signal processing toolkit.
  12. WeTextProcessing.deprecated (C++, 61 stars)
  13. wesubtitle (Python, 54 stars): Extract hard-coded video subtitles using OCR
  14. llm-papers (51 stars): List of Large Language Model Papers
  15. wesep (Python, 42 stars): Target Speaker Extraction Toolkit
  16. wecut (25 stars): Video cutting powered by AI
  17. WeSpeech-AI (18 stars): Open Source Speech/Text Data on AI
  18. nn-singal-processing-papers (17 stars): List of NN-based signal processing papers
  19. wenet_in_action_homework (Python, 16 stars): Homework for the WeNet in Action course
  20. wenet-e2e.github.io (CSS, 1 star): WeNet Community
  21. wenet-contributors (1 star): Contributors of WeNet, including individuals and companies.