  • Stars: 10
  • Rank 1,807,345 (Top 36 %)
  • Created about 1 year ago
  • Updated 12 months ago

Repository Details

Additional resources from our AACL tutorial

More Repositories

1

indicnlp_catalog

A collaborative catalog of NLP resources for Indic languages
543
star
2

Indic-BERT-v1

Indic-BERT-v1: BERT-based Multilingual Model for 11 Indic Languages and Indian-English. For latest Indic-BERT v2, check: https://github.com/AI4Bharat/IndicBERT
Python
273
star
3

IndicTrans2

Translation models for 22 scheduled languages of India
Python
223
star
4

indicnlp_corpus

Describes the IndicNLP corpus and associated datasets
Python
149
star
5

Indic-TTS

Text-to-Speech for languages of India
Jupyter Notebook
130
star
6

indicTrans

indicTranslate v1 - Machine Translation for 11 Indic languages. For latest v2, check: https://github.com/AI4Bharat/IndicTrans2
Jupyter Notebook
111
star
7

OpenHands

👐OpenHands: Making Sign Language Recognition Accessible. NOTE: No longer actively maintained. If you are interested in taking ownership of this project and moving it forward, please raise an issue.
Python
97
star
8

Chitralekha

Chitralekha - A video transcreation platform for Indic languages, supporting transcription, translation and voice-over
95
star
9

IndicLLMSuite

A blueprint for creating Pretraining and Fine-Tuning datasets for Indic languages
Python
89
star
10

IndicWav2Vec

Pretraining, fine-tuning and evaluation scripts for Indic-Wav2Vec2
Jupyter Notebook
74
star
11

IndicXlit

Transliteration models for 21 Indic languages
Python
68
star
12

NPTEL2020-Indian-English-Speech-Dataset

NPTEL2020: Speech2Text dataset for Indian-English Accent
Python
68
star
13

IndicBERT

Pretraining, fine-tuning and evaluation scripts for IndicBERT-v2 and IndicXTREME
Python
65
star
14

IndicNLP-Transliteration

Codebase for Indic-Transliteration using Seq2Seq RNN. For latest repo with Transformer-based models, check: https://github.com/AI4Bharat/IndicXlit
Python
58
star
15

Shoonya

Shoonya - Platform to annotate and label data at scale.
50
star
16

vistaar

Vistaar: Diverse Benchmarks and Training Sets for Indian Language ASR
Python
43
star
17

indic-bart

Pre-trained, multilingual sequence-to-sequence models for Indian languages
Python
43
star
18

Chitralekha-Backend

Transcribe your videos and translate them into Indic languages.
Python
27
star
19

Indic-Input-Tool-UI

Web Interface for Transliteration for Indic languages.
JavaScript
22
star
20

Shoonya-Backend

DRF-based API server for Shoonya platform
Python
20
star
21

Svarah

Svarah: Indian-English speech dataset collected across the country
Python
20
star
22

IndicVoices-R

A Massive Multilingual Multi-speaker Speech Corpus for Scaling Indian TTS
19
star
23

FBI

FBI: Finding Blindspots in LLM Evaluations with Interpretable Checklists
Python
18
star
24

Shoonya-Frontend

JavaScript
16
star
25

Dhruva-Platform

Dhruva is an open-source platform for serving language AI models at scale.
TypeScript
15
star
26

indic-asr-api-backend

Indic-Conformer models for ASR
Python
13
star
27

INCLUDE

Code for INCLUDE paper with pre-trained models
Python
13
star
28

DocSim

Synthetically generate random text document images with ground-truth
Python
11
star
29

Fonts-for-Indian-Scripts

Font style transfer for Devanāgarī script using GANs
Python
10
star
30

adapter-efficiency

Python
10
star
31

IndicLID

Language Identification for Indian languages
Python
9
star
32

setu

Setu is a comprehensive pipeline designed to clean, filter, and deduplicate diverse data sources including Web, PDF, and Speech data. Built on Apache Spark, Setu encompasses four key stages: document preparation, document cleaning and analysis, flagging and filtering, and deduplication.
HTML
9
star
33

speech-transcript-cleaning

Perform cleaning and normalization to standardize speech transcripts (train and test) across datasets.
Python
8
star
34

ezAnnotate

Annotation Platform for Machine Learning / Data Science, forked from DataTurks
JavaScript
7
star
35

Anudesh-Frontend

JavaScript
7
star
36

Chitralekha-Frontend

Frontend for Chitralekha platform
JavaScript
7
star
37

transactional-voice-ai

The code for transactional voice AI
Python
6
star
38

Indic-Glossary-Explorer

Glossary service for Indian languages
JavaScript
6
star
39

workshop-nlg-nlu-2022

Material for AI Workshop on Natural Language Understanding and Generation
6
star
40

indicnlp.ai4bharat.org

Archived old website for AI4Bhārat Indic-NLP
HTML
5
star
41

Chitralekha-Frontend-Lite

Lightweight version of Chitralekha
JavaScript
5
star
42

Indic-Glossaries

Collection of datasets for glossaries in Indian languages
4
star
43

sign-language.ai4bharat.org

Website for Indian Sign Language Recognition
4
star
44

INCLUDE-MS-Teams-Integration

An experimental Microsoft Teams integration of Sign Language models for word-level sign recognition
C#
4
star
45

Anudesh-Backend

Python
4
star
46

IndicMT-Eval

IndicMT Eval: A Dataset to Meta-Evaluate Machine Translation Metrics for Indian Languages, ACL 2023
HTML
4
star
47

IndicVoices

Jupyter Notebook
4
star
48

indic-numtowords

A simple lightweight library for text normalization for Indian Languages
Python
4
star
49

IndicSUPERB

Python
3
star
50

transactional-voice-ai_serving

Deployment code for all the Transactional Voice AI modules.
C++
3
star
51

CTQScorer

Python
3
star
52

Indic-Swipe

IndicSwipe is a collection of datasets and neural model architectures for decoding swipe gesture inputs on touch-based Indic language keyboards across 7 languages.
Python
3
star
53

Indic-OCR

2
star
54

DMU-DataDaan

Codebase for NLTM DMU's Data Upload System
JavaScript
2
star
55

2022.ai4bharat.org

Old website of AI4Bhārat using TinaCMS
JavaScript
2
star
56

setu-translate

Python
2
star
57

models.ai4bharat.org

A one-stop platform to try out all the models built by the AI4Bharat team.
JavaScript
2
star
58

Shoonya-Frontend-Old

Old version of Shoonya UI. Latest repo: https://github.com/AI4Bharat/Shoonya-Frontend
JavaScript
2
star
59

Varnam-Transliteration-UI

Transliteration Web Interface
JavaScript
1
star
60

ai4b-website

TypeScript
1
star
61

Dhruva-Evaluation-Suite

A tool to perform functional testing and performance testing of the Dhruva Platform
Python
1
star
62

indicnlp_suite

Natural Language Understanding resources for Indian languages
1
star
63

Input-Tools-By-AI4bharat

Enhance your typing experience in Chrome with AI4Bharat's Input Tools Chrome extension. This extension provides real-time transliteration suggestions for Indian languages, offering seamless integration into your typing workflow.
JavaScript
1
star
64

Lahaja

This repository holds the artifacts of 'LAHAJA: A Robust Multi-accent Benchmark for Evaluating Hindi ASR Systems'
1
star
65

Rasa

Expressive TTS Dataset for Assamese, Bengali, and Tamil.
Python
1
star
66

NeMo

Python
1
star
67

VocabAdaptation_LLM

Python
1
star