• This repository has been archived on 09/Jul/2021
  • Stars: 119
  • Rank: 297,930 (Top 6 %)
  • Language: HTML
  • License: BSD 3-Clause "New...
  • Created over 8 years ago
  • Updated over 8 years ago

Repository Details

Repository for the code of the "A Convolutional Attention Network for Extreme Summarization of Source Code" paper

Convolutional Attention Network

Code related to the paper:

@inproceedings{allamanis2016convolutional,
  title={A Convolutional Attention Network for Extreme Summarization of Source Code},
  author={Allamanis, Miltiadis and Peng, Hao and Sutton, Charles},
  booktitle={International Conference on Machine Learning (ICML)},
  year={2016}
}

For more information and the data of the paper, see here.

The project depends on Theano and uses Python 2.7.

Usage Instructions

To train the copy_attention model on the data, run:

> python copy_conv_rec_learner.py <training_file> <max_num_epochs> <D> <test_file>

where D is the embedding space dimension (128 in the paper). The best model will be saved at <training_file>.pkl

To evaluate an existing model, re-run the command with exactly the same parameters, except for <max_num_epochs>, which should be zero.
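
The two invocations above differ only in the epoch count, so they can be built from one helper. The following is a minimal sketch, not part of the repository: the function names, the file names train.json and test.json, and the epoch count of 50 are all hypothetical placeholders; only the script name and the argument order come from the usage line above.

```python
def build_command(training_file, max_num_epochs, embedding_dim, test_file):
    """Return the argument list for copy_conv_rec_learner.py (hypothetical helper)."""
    return ["python", "copy_conv_rec_learner.py", training_file,
            str(max_num_epochs), str(embedding_dim), test_file]


def build_eval_command(training_file, embedding_dim, test_file):
    """Same arguments as training, but with <max_num_epochs> set to zero."""
    return build_command(training_file, 0, embedding_dim, test_file)


# Placeholder file names and epoch count; D = 128 as in the paper.
train_cmd = build_command("train.json", 50, 128, "test.json")
eval_cmd = build_eval_command("train.json", 128, "test.json")

print(" ".join(train_cmd))
print(" ".join(eval_cmd))
# To launch a run: import subprocess; subprocess.check_call(train_cmd)
```

Keeping the training file, D and the test file in one place makes it harder to evaluate with parameters that differ from the ones used for training.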

The following code will generate names from a pre-trained model and a test_file with code examples.

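import numpy as np
# Assumption: the learner class is defined in copy_conv_rec_learner.py, the training script above.
from copy_conv_rec_learner import ConvolutionalCopyAttentionalRecurrentLearner

# model_fname: path to the saved model (e.g. <training_file>.pkl); test_file: file with code examples.
model = ConvolutionalCopyAttentionalRecurrentLearner.load(model_fname)
test_data, original_names = model.naming_data.get_data_in_recurrent_copy_convolution_format(test_file, model.padding_size)
test_name_targets, test_code_sentences, test_code, test_target_is_unk, test_copy_vectors = test_data

idx = 2  # pick an example from test_file
res = model.predict_name(np.atleast_2d(test_code[idx]))
# Python 2 print statements, matching the project's Python 2.7 requirement.
print "original name:", ' '.join(original_names[idx].split(','))
print "code:", ' '.join(test_code[idx])
print "generated names:"
for r, v in res:  # r: name subtokens; v: the value paired with that name (presumably a score)
    print v, ' '.join(r)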
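
Once names are generated, one way to compare a suggestion against the original name is subtoken overlap. The sketch below is an illustration only, not code from the repository and not necessarily the exact metric used in the paper. It assumes, as in the snippet above, that the original name is a comma-separated string of subtokens and that each generated name is a list of subtokens; the function name subtoken_f1 and the example name are hypothetical.

```python
from collections import Counter


def subtoken_f1(predicted, original):
    """Precision, recall and F1 between two lists of name subtokens.

    Subtokens are compared as multisets, so order is ignored.
    """
    overlap = sum((Counter(predicted) & Counter(original)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / float(len(predicted))
    recall = overlap / float(len(original))
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: the original name "get,file,name" against a two-subtoken suggestion.
original = "get,file,name".split(",")
predicted = ["get", "name"]
precision, recall, f1 = subtoken_f1(predicted, original)
print(precision, recall, f1)
```

In the snippet above, the same function could be applied to each r in res with original_names[idx].split(',') as the reference.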

More Repositories

1. OpenVocabCodeNLM (Python, 83 stars)
   Contains the code for our ICSE 2020 paper: Big Code != Big Vocabulary: Open-Vocabulary Language Models for Source Code and for its earlier pre-print: Maybe Deep Neural Networks are the Best Choice for Modeling Source Code (https://arxiv.org/abs/1903.05734). This is the first open vocabulary language model for code that uses the byte pair encoding algorithm (BPE) to learn a segmentation of code tokens into subword units.

2. naturalize (Java, 56 stars)
   Source code for the Naturalize project

3. api-mining (Java, 53 stars)
   Probabilistic API Mining

4. sequence-mining (Java, 44 stars)
   Probabilistic Sequence Mining

5. tassal (Java, 42 stars)
   Tree-based Autofolding Software Summarization Algorithm

6. eqnet (Python, 36 stars)
   Code related to "Learning Continuous Semantic Representations of Symbolic Expressions" project.

7. mineSStuBs (Java, 35 stars)
   Hosts our tool for mining simple "stupid" bugs (SStuBs).

8. codemining-core (Java, 22 stars)
   A set of tools for extracting tokens and ASTs from code

9. itemset-mining (Java, 19 stars)
   Probabilistic Itemset Mining

10. codemining-treelm (Java, 9 stars)
    Tree Language Models

11. clams (Python, 8 stars)
    CLAMS API Summarizer

12. mast-group.github.io (HTML, 4 stars)
    MAST Group Website

13. codemining-utils (Java, 4 stars)
    Utility classes for serialization, parameter loading, sampling and math

14. codemining-sequencelm (Java, 4 stars)
    Sequential Language Models

15. variable-naming-challenge (Python, 4 stars)
    Source code related to the variable naming challenge

16. commitmining-tools (Java, 3 stars)
    A set of tools for traversing a Git repository and possibly its files

17. js-analyser (JavaScript, 2 stars)
    Javascript analyser using Node and Esprima

18. maven-repo (Python, 2 stars)
    Maven repository for jars not on maven central

19. js-random-tester (JavaScript, 2 stars)
    JS Random testing tool and new Definition File creator using old versions

20. nlptools (Java, 1 star)
    A set of NLP tools that may be useful when processing text

21. DeepSStuBs (JavaScript, 1 star)
    DeepSStuBs is a framework for learning single statement bug detectors from an existing code corpus.

22. js-analyser-util (Java, 1 star)
    Util package to analyse instrumented and collected data from Node.JS projects