• Stars
    star
    186
  • Rank 207,316 (Top 5 %)
  • Language
    Python
  • License
    GNU General Publi...
  • Created over 9 years ago
  • Updated over 7 years ago

Repository Details

A tool to predict vulnerability discovery of binary only programs

VDiscover

VDiscover is a tool designed to train a vulnerability detection predictor. Given a vulnerability discovery procedure and a large enough number of training testcases, it extracts lightweight features to predict which testcases are potentially vulnerable. This repository contains an improved version of a proof-of-concept used to show experimental results in our technical report (available here).

Use cases

VDiscover is intended for situations where a large number of testcases must be analyzed with a costly vulnerability detection procedure. It can be trained to provide a quick prioritization of testcases. The feature extraction used for prediction is designed to be scalable. Nevertheless, this implementation is not particularly optimized, so it should be easy to improve its performance.

Requirements

Trace extraction currently works only on x86 (x86_64 support is planned and should be simple to add).

Quickstart

Before starting, it is recommended to manually install binutils, scikit-learn and setuptools (to perform a local installation). For instance, in Ubuntu/Debian:

# apt-get install python-numpy python-matplotlib python-setuptools python-scipy

Then we can execute:

git clone https://github.com/CIFASIS/VDiscover.git
cd VDiscover
python setup.py install --user

By default, the command-line utilities of VDiscover are installed locally in ~/.local/bin, so it is recommended to add this directory to the PATH variable. Our tool is composed of two main components:

  • fextractor: to extract dynamic and static features from test cases.
  • vpredictor: to train a new vulnerability prediction model or predict using a previously trained one. It can be used to cluster and visualize a set of test cases.
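Before running either utility, it can help to confirm that the ~/.local/bin directory mentioned above is actually on PATH. The following is a minimal sketch using only the Python standard library; the helper name local_bin_on_path is hypothetical and not part of VDiscover.

```python
import os
from pathlib import Path


def local_bin_on_path(environ=os.environ):
    """Return True if ~/.local/bin appears as an entry of PATH.

    Hypothetical helper for illustration; not part of VDiscover.
    """
    local_bin = Path.home() / ".local" / "bin"
    entries = environ.get("PATH", "").split(os.pathsep)
    # Compare expanded entries so that a literal "~/.local/bin" also matches.
    return any(Path(entry).expanduser() == local_bin for entry in entries if entry)


if __name__ == "__main__":
    if local_bin_on_path():
        print("~/.local/bin is on PATH")
    else:
        print("~/.local/bin is missing from PATH")
```

If the check fails, adding the directory to PATH in the shell profile is the usual fix.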

Some example testcases for very popular programs (grep, gzip, bc, ...) can be found in examples/testcases. For example, to extract raw dynamic features from an execution of bc:

fextractor --dynamic bc 

The resulting extracted features are:

/usr/bin/bc	isatty:0=Num32B0 isatty:0=Num32B8 setvbuf:0=Ptr32 setvbuf:1=NPtr32 setvbuf:2=Num32B8 setvbuf:3=Num32B0 ...
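To make the shape of this output concrete, here is a small parsing sketch. It assumes that the program path is separated from the features by a tab, and that each token has the form function:argument_index=type_tag (for instance, setvbuf:0=Ptr32). That reading is inferred from the sample line above rather than from a format specification, and parse_feature_line is a hypothetical helper, not part of VDiscover.

```python
def parse_feature_line(line):
    """Split one raw feature line into (program, events).

    Assumed layout (inferred from the sample output, not from a spec):
        <program path>\t<function>:<index>=<type> <function>:<index>=<type> ...
    Each event is returned as a (function, index, type_tag) tuple.
    """
    program, _, rest = line.rstrip("\n").partition("\t")
    events = []
    for token in rest.split():
        call, sep, type_tag = token.partition("=")
        function, colon, index = call.rpartition(":")
        # Skip anything that does not match the assumed pattern, e.g. a trailing "...".
        if not sep or not colon or not index.isdigit():
            continue
        events.append((function, int(index), type_tag))
    return program, events


sample = "/usr/bin/bc\tisatty:0=Num32B0 isatty:0=Num32B8 setvbuf:0=Ptr32 setvbuf:1=NPtr32"
program, events = parse_feature_line(sample)
print(program)
for event in events:
    print(event)
```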

This raw data can be used to train a new vulnerability prediction model or predict using a previously trained one. Additionally, more detailed (but outdated) documentation is available here.
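As an illustration of how such raw token sequences could be turned into fixed-length inputs for a classifier, the sketch below builds simple bag-of-words count vectors with the standard library. This is a generic technique shown under stated assumptions: it is not necessarily what vpredictor does internally, the function name bag_of_words is hypothetical, and the two traces are made-up token lists reusing tokens from the sample output.

```python
from collections import Counter


def bag_of_words(token_lists):
    """Map each token list to a count vector over a shared, sorted vocabulary.

    Generic illustration only; not VDiscover's actual vectorization code.
    """
    vocabulary = sorted({token for tokens in token_lists for token in tokens})
    vectors = []
    for tokens in token_lists:
        counts = Counter(tokens)
        vectors.append([counts.get(token, 0) for token in vocabulary])
    return vocabulary, vectors


# Two made-up traces built from tokens seen in the sample output.
traces = [
    "isatty:0=Num32B0 setvbuf:0=Ptr32 setvbuf:0=Ptr32".split(),
    "isatty:0=Num32B0 setvbuf:1=NPtr32".split(),
]
vocabulary, vectors = bag_of_words(traces)
print(vocabulary)
print(vectors)
```

Vectors of this kind can then be passed to any standard classifier, for example one from scikit-learn, together with labels produced by the vulnerability discovery procedure.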

License

GPL3

More Repositories

1

QuickFuzz

An experimental grammar fuzzer in Haskell using QuickCheck
Haskell
198
star
2

neural-fuzzer

Python
90
star
3

nosy-newt

Nosy Newt is a simple concolic execution tool, based on Triton, for exploring the input space of a binary executable program
Python
60
star
4

dense-sptam

Dense S-PTAM
C++
60
star
5

gnss-stereo-inertial-fusion

GNSS-Stereo-Inertial SLAM implementation that fuses GNSS, visual and inertial measurements using a tightly-coupled approach.
C++
36
star
6

slam_agricultural_evaluation

Shell
33
star
7

splitting_gan

Code for Class-Splitting Generative Adversarial Networks
Python
32
star
8

qss-solver

Modeling and simulation tool for continuous and hybrid systems.
C
28
star
9

object-detection-sptam

Online Object Detection and Localization on Stereo Visual SLAM System
Jupyter Notebook
25
star
10

OS-fuzzing

Using Machine Learning to predict the outcome of a zzuf fuzzing campaign
Python
24
star
11

exploiting-gan-internal-capacity

Code for reproducing experiments in "Exploiting GAN Internal Capacity for High-Quality Reconstruction of Natural Images"
Python
16
star
12

dataset-processing

Tools to process the Weed removing robot dataset
Python
15
star
13

wganvo

WGANVO: Monocular Visual Odometry based on WGAN
Python
9
star
14

megadeth

MEga DErivation with Template Haskell
Haskell
8
star
15

modelicacc

Modelica C Compiler implemented in C++ to develop and test novel algorithms for large scale models.
C++
8
star
16

distributed-sptam

Distributed S-PTAM
C++
8
star
17

basalt-with-persistent-map

C++
5
star
18

spp_estimation

Seed-per-pod estimation for plant breeding using deep learning
Python
5
star
19

power-devs

PowerDEVS is an integrated tool for hybrid systems modeling and simulation based on the DEVS formalism.
C++
5
star
20

sb-graph

Set Based Graph Library
C++
4
star
21

ghc-cm

Prototype GHC with Class Morphisms
Haskell
4
star
22

curso-herramientas

Machine Learning Tools Course for the Polo Tecnológico
Python
4
star
23

vdiscover-workshop

Python
2
star
24

ORB_SLAM3

C++
2
star
25

svo-2.0

C++
2
star
26

okvis_ros

C++
1
star
27

grotloc

Ground Truth for Loop Closure (GroTLoC)
Python
1
star
28

weed_robot_simulation

Dockerfile
1
star
29

bug-report-scripts

Python
1
star
30

basalt_ros

Shell
1
star
31

certificate

Haskell
1
star
32

rovio

C++
1
star
33

rebvo

MATLAB
1
star
34

FLVIS

C++
1
star
35

msckf_vio

C++
1
star
36

r-vio

C++
1
star
37

VINS-Fusion

C++
1
star
38

Kimera-VIO-ROS

C++
1
star
39

StereoLoopDetector

Code for the article "Addressing the challenges of loop detection in agricultural environments" published in the Journal of Field Robotics, 2024.
1
star