  • Stars: 201
  • Rank: 194,491 (Top 4 %)
  • Language: C++
  • License: Apache License 2.0
  • Created almost 10 years ago
  • Updated over 3 years ago



Nice2Predict

Learning framework for program property prediction.

This is a backend for tools such as JSNice (http://jsnice.org/) for JavaScript, which can predict program properties such as variable names and types. The backend is designed to extend such tools to multiple programming languages. For this reason, the machine learning machinery has been extracted into this tool.

To get a complete tool, one must include a parser for each programming language of interest and train on a lot of code.

We have included an example frontend for JavaScript deminification at http://github.com/eth-srl/UnuglifyJS. This tool works together with the Nice2Server.

Compiling

To compile, first install the dependencies.

On Ubuntu:

sudo apt-get install libmicrohttpd-dev libcurl4-openssl-dev bazel libgoogle-glog-dev libgflags-dev

On Mac:

brew tap caskroom/versions
brew cask install java8
brew install libmicrohttpd bazel glog gflags

On Windows, install libmicrohttpd, curl and bazel by following the installation instructions for each.

[Optional] Install Google Performance Tools:

  1. libunwind: http://download.savannah.gnu.org/releases/libunwind/libunwind-1.1.tar.gz

  2. gperftools: https://code.google.com/p/gperftools/

Finally, call

bazel build //...

To run tests, call

bazel test //...

Training

Run:

bazel run //src/training/train

To get options for training, use:

bazel run //src/training/train -- --help

By default, train reads input programs (converted to JSON, for example with UnuglifyJS) from the file testdata in the current directory and writes the trained model to files.

To train the model using pseudolikelihood, use the following parameters:

bazel run //src/training/train -- -training_method pl -input path/to/input/file --logtostderr

You can control the pseudolikelihood-specific beam size with the -beam_size parameter, which is separate from the beam size used during MAP inference.

//src/training/train expects data to be in protobuf recordIO format. To use JSON input, use //src/training/train_json instead.

Factors

Factor features are enabled by default in Nice2Predict. To disable them, launch training with the following command:

bazel run //src/training/train -- -use_factors=false -input path/to/input/file --logtostderr
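The training invocations above differ only in a few flags, so they can be assembled by a small helper. The following is a minimal sketch, not part of Nice2Predict: the function name train_cmd and its argument order are hypothetical, and it only prints the command instead of running it. It uses only the flags shown above (-training_method, -use_factors, -input, --logtostderr) and assumes the input path contains no spaces.

```shell
#!/bin/sh
# Hypothetical helper (not part of Nice2Predict): prints a training command
# built from the flags described in this README. It does not run bazel.
# Usage: train_cmd INPUT [METHOD] [USE_FACTORS]
#   INPUT        path to the training input file (assumed to contain no spaces)
#   METHOD       optional training method, e.g. "pl" for pseudolikelihood
#   USE_FACTORS  "false" to disable factor features; anything else keeps the default
train_cmd() {
  input="$1"
  method="${2:-}"
  use_factors="${3:-true}"

  cmd="bazel run //src/training/train --"
  if [ -n "$method" ]; then
    cmd="$cmd -training_method $method"
  fi
  if [ "$use_factors" = "false" ]; then
    cmd="$cmd -use_factors=false"
  fi
  cmd="$cmd -input $input --logtostderr"

  printf '%s\n' "$cmd"
}

# Print the pseudolikelihood training command for a placeholder input path.
train_cmd path/to/input/file pl
```

Calling train_cmd path/to/input/file "" false prints the command that disables factor features. Printing rather than executing keeps the sketch safe to try; the printed line can be reviewed and then run by hand.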

Predicting properties

To predict properties for new programs, start a server after a model has been trained:

bazel run //src/server/nice2serverproto -- --logtostderr

To run the old JsonRPC API:

bazel run //src/server/nice2server -- --logtostderr

You can debug and observe deobfuscation using the viewer in viewer/viewer.html.
