Repository Details

Source code understanding via Machine Learning techniques

Awesome Source Code Analysis Via Machine Learning Techniques

A list of resources for source code analysis applications using Machine Learning techniques (e.g., Deep Learning, PCA, SVM, Bayesian methods, probabilistic models, reinforcement learning techniques, etc.)

Maintainers - Peter Teoh

Contributing

Please feel free to open pull requests, email Peter Teoh ([email protected]), or join our chat to add links.

[Join the chat at https://gitter.im/tthtlc/awesome-source-analysis]

Table of Contents

Machine-Learning-Guided Selectively Unsound Static Analysis http://www.seas.upenn.edu/~kheo/home/paper/icse17-heohyi.pdf

A Survey of Machine Learning for Big Code and Naturalness https://arxiv.org/pdf/1709.06182

Ariadne: Analysis for Machine Learning Programs https://arxiv.org/pdf/1805.04058

The use of machine learning with signal- and NLP processing of source code to fingerprint, detect, and classify vulnerabilities and weaknesses with MARFCAT https://arxiv.org/abs/1010.2511

VulDeePecker: A Deep Learning-Based System for Vulnerability Detection https://arxiv.org/pdf/1801.01681

code2vec: Learning Distributed Representations of Code https://arxiv.org/pdf/1803.09473

Automated software vulnerability detection with machine learning https://arxiv.org/abs/1803.04497

Automatic feature learning for vulnerability prediction https://arxiv.org/pdf/1708.02368

Neural Turing Machines https://arxiv.org/pdf/1410.5401.pdf

DeepCoder: Learning to Write Programs https://arxiv.org/abs/1611.01989

Recent Advances in Neural Program Synthesis https://arxiv.org/pdf/1802.02353

Neural-Guided Deductive Search for Real-Time Program Synthesis https://arxiv.org/pdf/1804.01186

RobustFill: Neural Program Learning under Noisy I/O https://arxiv.org/pdf/1703.07469

Neural Program Search: Solving Programming Tasks from Description https://arxiv.org/pdf/1802.04335

A Syntactic Neural Model for General-Purpose Code Generation https://arxiv.org/pdf/1704.01696

Building Machines That Learn and Think Like People https://arxiv.org/pdf/1604.00289

Differentiable Programs with Neural Libraries https://arxiv.org/pdf/1611.02109

Summary-TerpreT: A Probabilistic Programming Language for Program Induction https://arxiv.org/pdf/1612.00817

Auto-Documenation for Software Development https://arxiv.org/pdf/1701.08485

BOOK: Storing Algorithm-Invariant Episodes for Deep Reinforcement Learning https://arxiv.org/pdf/1709.01308

Boda-RTC: Productive Generation of Portable, Efficient Code ... https://arxiv.org/pdf/1606.00094

Making Neural Programming Architectures Generalize via Recursion https://arxiv.org/pdf/1704.06611

Differentiable Functional Program Interpreters https://arxiv.org/pdf/1611.01988

Deep Probabilistic Programming Languages: A Qualitative Study https://arxiv.org/pdf/1804.06458

BinPro: A Tool for Binary Source Code Provenance https://arxiv.org/pdf/1711.00830

A Survey on Compiler Autotuning using Machine Learning https://arxiv.org/pdf/1801.04405

Estimating defectiveness of source code: A predictive model using GitHub content https://arxiv.org/pdf/1803.07764

EMBER: An Open Dataset for Training Static PE Malware Machine Learning Models https://arxiv.org/pdf/1804.04637

On End-to-End Program Generation from User Intention by Deep Neural Networks https://arxiv.org/pdf/1510.07211

Utilizing Static Analysis and Code Generation to Accelerate Neural Networks https://arxiv.org/abs/1206.6466

DLPaper2Code: Auto-generation of Code from Deep Learning Research Paper https://arxiv.org/pdf/1711.03543

Inferring Generative Model Structure with Static Analysis https://arxiv.org/pdf/1709.02477

Sorting and Transforming Program Repair Ingredients via Deep Learning Code Similarities https://arxiv.org/pdf/1707.04742

DeepAPT: Nation-State APT Attribution Using End-to-End Deep Neural Networks https://arxiv.org/pdf/1711.09666

Automatic Structure Discovery for Large Source Code https://arxiv.org/pdf/1202.3335

Comment Generation for Source Code: Survey https://arxiv.org/pdf/1802.02971

Towards Reverse-Engineering Black-Box Neural Networks https://arxiv.org/abs/1711.01768

Database Reverse Engineering based on Association Rule Mining https://arxiv.org/pdf/1004.3272.pdf

Automated detection and classification of cryptographic algorithms in binary programs through machine learning https://arxiv.org/pdf/1503.01186

Automatically Generating Commit Messages from Diffs using Neural Machine Translation https://arxiv.org/pdf/1708.09492

When Coding Style Survives Compilation: De-anonymizing Programmers from Executable Binaries https://arxiv.org/pdf/1512.08546

Code smells https://arxiv.org/pdf/1802.06063

Data Driven Exploratory Attacks on Black Box Classifiers in Adversarial Domains https://arxiv.org/pdf/1703.07909

pix2code: Generating Code from a Graphical User Interface Screenshot https://arxiv.org/pdf/1705.07962

Deep Learning in Software Engineering https://arxiv.org/pdf/1805.04825

Predicting Software Defects Through SVM: An Empirical Approach https://arxiv.org/pdf/1803.03220

A Survey of Reverse Engineering and Program Comprehension https://arxiv.org/pdf/cs/0503068

https://www.owasp.org/images/7/72/OWASP_Top_10-2017_%28en%29.pdf.pdf

https://arxiv.org/pdf/1709.07101.pdf

https://arxiv.org/pdf/1805.05206.pdf

https://arxiv.org/pdf/1807.09160.pdf

https://arxiv.org/pdf/1806.07336.pdf

Or just search arxiv.org (expect some inaccuracies in the papers it identifies): recent arxiv.org search

LLVM-based vulnerability search

As an extension, see https://ml4code.github.io/ (this site is an offshoot of the paper https://arxiv.org/abs/1709.06182).