  • Stars: 146
  • Rank: 252,769 (Top 5%)
  • Language: Jupyter Notebook
  • License: MIT License
  • Created almost 6 years ago
  • Updated over 5 years ago

Repository Details

Interface to my model, which predicts the techniques one should use to solve a competitive programming problem and get an AC (Accepted verdict)

Code with AI

Code with AI is a tool that predicts which techniques one should use to solve a competitive programming problem and get an AC.

Tool: https://code-with-ai.app.render.com/

Usage:

  1. Copy and paste the problem statement
  2. Hit Enter :)

  • This tool was very well received by the competitive programming community. Check out my blog post on Codeforces.

  • This tool was also shared by Jeremy Howard on Twitter. Check out his tweet.

  • Some analytics: 3300+ unique users used this tool between its launch and June 2019.

Dataset

I scraped the dataset from CodeChef and Codeforces. The scraping scripts and the dataset are in ulmfit-model/competitive-programming-websites-scripts

The dataset on which the model was trained has 92 classes:

'mobius-function', 'queue', 'greedy', 'suffix-array', 'bitmasks', 'digit dp', 'lca', 'gcd', 'probabilities', 'combinatorics', 'graph matchings', 'easy-medium', 'precomputation', 'sprague-grundy', 'math', 'centroid-decomposition', 'link-cut-tree', 'expression parsing', 'constructive algorithms', 'medium', 'schedules', 'euler tour', 'easy', 'challenge', 'implementation', 'binary search', 'matrices', 'two pointers', 'dfs', 'dp+bitmask', 'sets', 'sqrt-decomposition', 'dijkstra', 'line-sweep', 'data structures', 'tree-dp', 'hard', 'mst', 'recursion', 'games', 'suffix-trees', 'kmp', 'stack', 'brute force', 'medium-hard', 'prefix-sum', 'graphs', '2-sat', 'shortest paths', 'heavy-light', 'heaps', '*special problem', 'trees', 'array', 'sliding-window', 'inclusion-exclusion', 'meet-in-the-middle', 'dfs and similar', 'sortings', 'pigeonhole', 'xor', 'gaussian-elimination', 'lucas theorem', 'divide and conquer', 'flows', 'strings', 'matrix-expo', 'number theory', 'bipartite', 'knapsack', 'sieve', 'ternary search', 'modulo', 'backtracking', 'treap', 'trie', 'dp', 'fenwick', 'observation', 'fibonacci', 'convex-hull', 'chinese remainder theorem', 'string suffix structures', 'geometry', 'lazy propagation', 'factorization', 'dsu', 'fft', 'segment-tree', 'hashing', 'bfs', 'prime', 'mo algorithm'

These classes are the union of the tags used by Codeforces and CodeChef. More details about data preparation are in ulmfit-model/code-with-ai-data-preparation.ipynb
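
The union-of-tags idea can be sketched in a few lines of Python. This is only an illustration, assuming the task is framed as multi-label classification (one problem can carry several tags). The tag lists and the `encode`/`decode` helpers are hypothetical and are not taken from the data preparation notebook.

```python
# Hypothetical per-site tag lists; the real ones come from the scraped dataset.
codeforces_tags = ["dp", "greedy", "graphs", "dfs and similar"]
codechef_tags = ["dp", "greedy", "segment-tree", "easy"]

# The label set is the union of both sites' tags, sorted for a stable column order.
classes = sorted(set(codeforces_tags) | set(codechef_tags))
index_of = {tag: i for i, tag in enumerate(classes)}


def encode(tags):
    """Return a multi-hot vector with a 1 for every tag the problem carries."""
    vector = [0] * len(classes)
    for tag in tags:
        vector[index_of[tag]] = 1
    return vector


def decode(vector):
    """Map a multi-hot vector back to its tag names."""
    return [tag for tag, bit in zip(classes, vector) if bit]


print(classes)
print(encode(["dp", "greedy"]))
```

Sorting the union keeps the column order reproducible between runs, which matters when saved predictions are decoded later.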

Model

The code for the model is in the notebook ulmfit-model/code-with-ai.ipynb

The classifier is built in three steps: start from language model weights pretrained on WikiText-103, fine-tune that language model on the current dataset, then train a classifier on top of the fine-tuned language model.

Results

The classifier has an F1 score of ~49.
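
For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score can be computed for multi-label predictions. It assumes micro-averaging over multi-hot vectors. The text above does not say which averaging or scale was used (the ~49 is presumably a percentage), so the `micro_f1` helper and the toy data are illustrative only.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1 over multi-hot label vectors (lists of 0/1)."""
    tp = fp = fn = 0
    for truth, pred in zip(y_true, y_pred):
        for t, p in zip(truth, pred):
            if t and p:
                tp += 1  # tag predicted and actually present
            elif p:
                fp += 1  # tag predicted but not present
            elif t:
                fn += 1  # tag present but missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data: 3 problems, 4 hypothetical classes.
y_true = [[1, 0, 1, 0], [0, 1, 0, 0], [1, 1, 0, 0]]
y_pred = [[1, 0, 0, 0], [0, 1, 1, 0], [1, 0, 0, 0]]
score = micro_f1(y_true, y_pred)
print(round(score, 4))
```

With many rare tags, micro- and macro-averaged F1 can differ a lot, so the averaging choice should be stated alongside any reported number.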

TODOs

  • Try improving the score further by using a bidirectional RNN

  • Try improving the score further by using an approach similar to the DeViSE paper, i.e. instead of training the model to predict 0 or 1, train the model to move closer to the embedding vector representation of the labels. The intuition is that labels in competitive programming, like graph, dfs, bfs etc., aren't disjoint.
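
The embedding idea in the last TODO can be illustrated with a small, self-contained sketch. It is not the DeViSE method itself: cosine distance is used here as a simple stand-in for the paper's ranking loss, and the 3-dimensional label vectors, `embedding_loss` and `rank_labels` are all hypothetical.

```python
import math

# Hypothetical 3-dimensional label embeddings; real ones would be learned vectors.
label_embeddings = {
    "graphs": [0.9, 0.1, 0.0],
    "dfs": [0.8, 0.3, 0.1],
    "bfs": [0.8, 0.2, 0.2],
    "number theory": [0.0, 0.1, 0.9],
}


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def embedding_loss(predicted, label):
    """Cosine distance between the model output and the target label's embedding."""
    return 1.0 - cosine(predicted, label_embeddings[label])


def rank_labels(predicted):
    """Labels ordered from most to least similar to the predicted vector."""
    return sorted(
        label_embeddings,
        key=lambda name: cosine(predicted, label_embeddings[name]),
        reverse=True,
    )


# A made-up model output that lies close to the graph-related labels.
predicted = [0.85, 0.25, 0.1]
ranking = rank_labels(predicted)
print(ranking)
```

Because related labels such as graphs, dfs and bfs sit near each other in the embedding space, a prediction that misses the exact label still gets a small loss when it lands on a neighbour, which is the behaviour the TODO is after.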
