WeNet
Roadmap | Docs | Papers | Runtime | Pretrained Models | HuggingFace
We share Net together.
🔥 News
- 2022.12: Horizon X3 Pi BPU (see #1597), Kunlun Core XPU (see #1455), Raspberry Pi (see #1477), iOS (see #1549).
- 2022.11: TrimTail paper released, see https://arxiv.org/pdf/2211.00522.pdf
Highlights
- Production first and production ready: This is WeNet's core design principle. WeNet provides full-stack production solutions for speech recognition.
- Accurate: WeNet achieves SOTA results on many public speech datasets.
- Lightweight: WeNet is easy to install, easy to use, well designed, and well documented.
Install
Please refer to the docs for installation instructions.
Discussion & Communication
You can discuss directly on GitHub Issues.
For Chinese users, you can also scan the QR code on the left to follow the official WeNet account. We created a WeChat group for better discussion and quicker response. Please scan the personal QR code on the right; its owner is responsible for inviting you to the chat group.
Acknowledgements
- We borrowed a lot of code from ESPnet for Transformer-based modeling.
- We borrowed a lot of code from Kaldi for WFST-based decoding for LM integration.
- We referred to EESEN for building the TLG-based graph for LM integration.
- We referred to OpenTransformer for Python batch inference of e2e models.
Citations
@inproceedings{yao2021wenet,
title={WeNet: Production oriented Streaming and Non-streaming End-to-End Speech Recognition Toolkit},
author={Yao, Zhuoyuan and Wu, Di and Wang, Xiong and Zhang, Binbin and Yu, Fan and Yang, Chao and Peng, Zhendong and Chen, Xiaoyu and Xie, Lei and Lei, Xin},
booktitle={Proc. Interspeech},
year={2021},
address={Brno, Czech Republic},
organization={ISCA}
}
@article{zhang2022wenet,
title={WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit},
author={Zhang, Binbin and Wu, Di and Peng, Zhendong and Song, Xingchen and Yao, Zhuoyuan and Lv, Hang and Xie, Lei and Yang, Chao and Pan, Fuping and Niu, Jianwei},
journal={arXiv preprint arXiv:2203.15455},
year={2022}
}