indicnlp_catalog
A collaborative catalog of NLP resources for Indic languages
Indic-BERT-v1
Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.com/AI4Bharat/IndicBERT
IndicTrans2
Translation models for 22 scheduled languages of India
indicnlp_corpus
Describes the IndicNLP corpus and associated datasets
Indic-TTS
Text-to-Speech for languages of India
indicTrans
indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2
OpenHands
👐 OpenHands: Making Sign Language Recognition Accessible. | **NOTE:** No longer actively maintained. If you are interested in owning this and taking it forward, please raise an issue.
Chitralekha
Chitralekha - A video transcreation platform for Indic languages, supporting transcription, translation and voice-over
IndicLLMSuite
A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages
IndicWav2Vec
Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2
IndicXlit
Transliteration models for 21 Indic languages
NPTEL2020-Indian-English-Speech-Dataset
NPTEL2020: Speech2Text dataset for Indian-English Accent
IndicBERT
Pretraining, fine-tuning and evaluation scripts for IndicBERT-v2 and IndicXTREME
IndicNLP-Transliteration
Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/IndicXlit
Shoonya
Shoonya - Platform to Annotate and label data at scale.
vistaar
Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR
indic-bart
Pre-trained, multilingual sequence-to-sequence models for Indian languages
Chitralekha-Backend
Transcribe your videos and translate them into Indic languages.
Indic-Input-Tool-UI
Web Interface for Transliteration for Indic languages.
Shoonya-Backend
DRF-based API server for Shoonya platform
Svarah
Svarah: Indian-English speech dataset collected across the country
IndicVoices-R
A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
FBI
FBI: Finding Blindspots in LLM Evaluations with Interpretable Checklists
Shoonya-Frontend
Dhruva-Platform
Dhruva is an open-source platform for serving language AI models at scale.
INCLUDE
Code for INCLUDE paper with pre-trained models
DocSim
Synthetically generate random text document images with ground-truth
Fonts-for-Indian-Scripts
Font style transfer for Devanāgarī script using GANs
aacl23-mnmt-tutorial
Additional resources from our AACL tutorial
adapter-efficiency
IndicLID
Language Identification for Indian languages
setu
Setu is a comprehensive pipeline designed to clean, filter, and deduplicate diverse data sources including Web, PDF, and Speech data. Built on Apache Spark, Setu encompasses four key stages: document preparation, document cleaning and analysis, flagging and filtering, and deduplication.
speech-transcript-cleaning
Perform cleaning and normalization to standardize speech transcripts (train and test) across datasets.
ezAnnotate
Annotation Platform for Machine Learning / Data Science, forked from DataTurks
Anudesh-Frontend
Chitralekha-Frontend
Frontend for Chitralekha platform
transactional-voice-ai
The code for transactional voice AI
Indic-Glossary-Explorer
Glossary service for Indian languages
workshop-nlg-nlu-2022
Material for AI Workshop on Natural Language Understanding and Generation
indicnlp.ai4bharat.org
Archived old website for AI4Bhārat Indic-NLP
Chitralekha-Frontend-Lite
Lightweight version of Chitralekha
Indic-Glossaries
Collection of datasets for glossaries in Indian languages
sign-language.ai4bharat.org
Website for Indian Sign Language Recognition
INCLUDE-MS-Teams-Integration
An experimental Microsoft Teams integration of Sign Language models for word-level sign recognition
Anudesh-Backend
IndicMT-Eval
IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages, ACL 2023
IndicVoices
indic-numtowords
A simple lightweight library for text normalization for Indian Languages
IndicSUPERB
transactional-voice-ai_serving
Deployment code for all the Transactional Voice AI modules.
CTQScorer
Indic-Swipe
IndicSwipe is a collection of datasets and neural model architectures for decoding swipe gesture inputs on touch-based Indic language keyboards across 7 languages.
Indic-OCR
DMU-DataDaan
Codebase for NLTM DMU's Data Upload System
2022.ai4bharat.org
Old website of AI4Bhārat using TinaCMS
setu-translate
models.ai4bharat.org
A one-stop platform to try out all the models built by the AI4Bharat team.
Shoonya-Frontend-Old
Old version of Shoonya UI. Latest repo: https://github.com/AI4Bharat/Shoonya-Frontend
Varnam-Transliteration-UI
Transliteration Web Interface
ai4b-website
Dhruva-Evaluation-Suite
A tool to perform functional testing and performance testing of the Dhruva Platform
indicnlp_suite
Natural Language Understanding resources for Indian languages
Input-Tools-By-AI4bharat
Enhance your typing experience in Chrome with AI4Bharat's Input Tools Chrome extension. This extension provides real-time transliteration suggestions for Indian languages, offering seamless integration into your typing workflow.
Lahaja
This repository holds the artifacts of 'LAHAJA: A Robust Multi-accent Benchmark for Evaluating Hindi ASR Systems'
Rasa
Expressive TTS Dataset for Assamese, Bengali, and Tamil.
NeMo
VocabAdaptation_LLM