  • Stars: 2
  • Created almost 3 years ago
  • Updated about 2 years ago


More Repositories

1

indicnlp_catalog

A collaborative catalog of NLP resources for Indic languages
543
star
2

Indic-BERT-v1

Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For the latest Indic-BERT v2, check: https://github.com/AI4Bharat/IndicBERT
Python
273
star
3

IndicTrans2

Translation models for 22 scheduled languages of India
Python
223
star
4

indicnlp_corpus

Describes the IndicNLP corpus and associated datasets
Python
149
star
5

Indic-TTS

Text-to-Speech for languages of India
Jupyter Notebook
130
star
6

indicTrans

indicTrans v1 - Machine Translation for 11 Indic languages. For the latest v2, check: https://github.com/AI4Bharat/IndicTrans2
Jupyter Notebook
111
star
7

OpenHands

👐 OpenHands: Making Sign Language Recognition Accessible. NOTE: No longer actively maintained. If you are interested in taking ownership of this project and carrying it forward, please raise an issue.
Python
97
star
8

Chitralekha

Chitralekha - A video transcreation platform for Indic languages, supporting transcription, translation and voice-over
95
star
9

IndicLLMSuite

A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages
Python
89
star
10

IndicWav2Vec

Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2
Jupyter Notebook
74
star
11

IndicXlit

Transliteration models for 21 Indic languages
Python
68
star
12

NPTEL2020-Indian-English-Speech-Dataset

NPTEL2020: Speech2Text dataset for Indian-English Accent
Python
68
star
13

IndicBERT

Pretraining, fine-tuning and evaluation scripts for IndicBERT-v2 and IndicXTREME
Python
65
star
14

IndicNLP-Transliteration

Codebase for Indic-Transliteration using Seq2Seq RNN. For the latest repo with Transformer-based models, check: https://github.com/AI4Bharat/IndicXlit
Python
58
star
15

Shoonya

Shoonya - Platform to annotate and label data at scale.
50
star
16

vistaar

Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR
Python
43
star
17

indic-bart

Pre-trained, multilingual sequence-to-sequence models for Indian languages
Python
43
star
18

Chitralekha-Backend

Transcribe your videos and translate them into Indic languages.
Python
27
star
19

Indic-Input-Tool-UI

Web Interface for Transliteration for Indic languages.
JavaScript
22
star
20

Shoonya-Backend

DRF-based API server for Shoonya platform
Python
20
star
21

Svarah

Svarah: Indian-English speech dataset collected across the country
Python
20
star
22

IndicVoices-R

A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
19
star
23

FBI

FBI: Finding Blindspots in LLM Evaluations with Interpretable Checklists
Python
18
star
24

Shoonya-Frontend

JavaScript
16
star
25

Dhruva-Platform

Dhruva is an open-source platform for serving language AI models at scale.
TypeScript
15
star
26

indic-asr-api-backend

Indic-Conformer models for ASR
Python
13
star
27

INCLUDE

Code for INCLUDE paper with pre-trained models
Python
13
star
28

DocSim

Synthetically generate random text document images with ground-truth
Python
11
star
29

Fonts-for-Indian-Scripts

Font style transfer for Devanāgarī script using GANs
Python
10
star
30

aacl23-mnmt-tutorial

Additional resources from our AACL tutorial
10
star
31

adapter-efficiency

Python
10
star
32

IndicLID

Language Identification for Indian languages
Python
9
star
33

setu

Setu is a comprehensive pipeline designed to clean, filter, and deduplicate diverse data sources including Web, PDF, and Speech data. Built on Apache Spark, Setu encompasses four key stages: document preparation, document cleaning and analysis, flagging and filtering, and deduplication.
HTML
9
star
34

speech-transcript-cleaning

Perform cleaning and normalization to standardize speech transcripts (train and test) across datasets.
Python
8
star
35

ezAnnotate

Annotation Platform for Machine Learning / Data Science, forked from DataTurks
JavaScript
7
star
36

Anudesh-Frontend

JavaScript
7
star
37

Chitralekha-Frontend

Frontend for Chitralekha platform
JavaScript
7
star
38

transactional-voice-ai

The code for transactional voice AI
Python
6
star
39

Indic-Glossary-Explorer

Glossary service for Indian languages
JavaScript
6
star
40

workshop-nlg-nlu-2022

Material for AI Workshop on Natural Language Understanding and Generation
6
star
41

indicnlp.ai4bharat.org

Archived old website for AI4Bhārat Indic-NLP
HTML
5
star
42

Chitralekha-Frontend-Lite

Lightweight version of Chitralekha
JavaScript
5
star
43

Indic-Glossaries

Collection of datasets for glossaries in Indian languages
4
star
44

sign-language.ai4bharat.org

Website for Indian Sign Language Recognition
4
star
45

INCLUDE-MS-Teams-Integration

An experimental Microsoft Teams integration of Sign Language models for word-level sign recognition
C#
4
star
46

Anudesh-Backend

Python
4
star
47

IndicMT-Eval

IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages, ACL 2023
HTML
4
star
48

IndicVoices

Jupyter Notebook
4
star
49

indic-numtowords

A simple lightweight library for text normalization for Indian Languages
Python
4
star
50

IndicSUPERB

Python
3
star
51

transactional-voice-ai_serving

Deployment code for all the Transactional Voice AI modules.
C++
3
star
52

CTQScorer

Python
3
star
53

Indic-Swipe

IndicSwipe is a collection of datasets and neural model architectures for decoding swipe gesture inputs on touch-based Indic language keyboards across 7 languages.
Python
3
star
54

DMU-DataDaan

Codebase for NLTM DMU's Data Upload System
JavaScript
2
star
55

2022.ai4bharat.org

Old website of AI4Bhārat using TinaCMS
JavaScript
2
star
56

setu-translate

Python
2
star
57

models.ai4bharat.org

A one-stop platform to try out all the models built by the AI4Bharat team.
JavaScript
2
star
58

Shoonya-Frontend-Old

Old version of Shoonya UI. Latest repo: https://github.com/AI4Bharat/Shoonya-Frontend
JavaScript
2
star
59

Varnam-Transliteration-UI

Transliteration Web Interface
JavaScript
1
star
60

ai4b-website

TypeScript
1
star
61

Dhruva-Evaluation-Suite

A tool to perform functional testing and performance testing of the Dhruva Platform
Python
1
star
62

indicnlp_suite

Natural Language Understanding resources for Indian languages
1
star
63

Input-Tools-By-AI4bharat

Enhance your typing experience in Chrome with AI4Bharat's Input Tools Chrome extension. This extension provides real-time transliteration suggestions for Indian languages, offering seamless integration into your typing workflow.
JavaScript
1
star
64

Lahaja

This repository holds the artifacts of 'LAHAJA: A Robust Multi-accent Benchmark for Evaluating Hindi ASR Systems'
1
star
65

Rasa

Expressive TTS Dataset for Assamese, Bengali, and Tamil.
Python
1
star
66

NeMo

Python
1
star
67

VocabAdaptation_LLM

Python
1
star