• Stars
    star
    285
  • Rank 145,115 (Top 3 %)
  • Language
    Python
  • License
    Other
  • Created about 6 years ago
  • Updated over 1 year ago


Repository Details

🧠 A tool that makes AI easier.

ergo


ergo (from the Latin sentence "Cogito ergo sum") is a command line tool that makes machine learning with Keras easier.

It can be used to:

  • scaffold new projects in seconds and customize only a minimum amount of code.
  • encode samples, import and optimize CSV datasets and train the model with them.
  • visualize the model structure, loss and accuracy functions during training.
  • determine how each of the input features affects the accuracy by differential inference.
  • export a simple REST API to use your models from a server.

Installing

sudo pip3 install ergo-ai

Installing from Sources

git clone https://github.com/evilsocket/ergo.git
cd ergo
sudo pip3 install -r requirements.txt
python3 setup.py build
sudo python3 setup.py install

Enable GPU support (optional)

Make sure you have CUDA 11 and cuDNN 8.0 installed and then:

sudo pip3 uninstall tensorflow
sudo pip3 install tensorflow-gpu

Example Projects

Usage

To print the general help menu:

ergo help

To print action specific help:

ergo <action> -h

Start by printing the available actions with ergo help. To verify your installation, you can also print the software version (ergo, keras and tensorflow versions) and some hardware info with ergo info.

Creating a Project

Once ready, create a new project named example (run ergo create -h to see how to customize the initial model):

ergo create example

Inside the newly created example folder, there will be three files:

  1. prepare.py, used to preprocess your dataset and inputs (if, for instance, you're using pictures instead of a csv file).
  2. model.py, that you can change to customize the model.
  3. train.py, for the training algorithm.

By default, ergo will simply read the dataset as a CSV file, build a small neural network with 10 inputs, two hidden layers of 30 neurons each and 2 outputs and use a pretty standard training algorithm.
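To get a feel for how small the default network is, the following sketch counts its trainable parameters. It assumes plain fully connected layers with one bias per neuron; the generated model.py may differ (for example by adding other layer types), so treat the number as an estimate, not as ergo's reported figure.

```python
# Layer sizes taken from the default described above:
# 10 inputs, two hidden layers of 30 neurons each, 2 outputs.
LAYER_SIZES = [10, 30, 30, 2]


def dense_param_count(sizes):
    """Count weights and biases of a stack of fully connected layers."""
    total = 0
    for n_in, n_out in zip(sizes, sizes[1:]):
        # One weight per input/output pair, plus one bias per output neuron.
        total += n_in * n_out + n_out
    return total


default_params = dense_param_count(LAYER_SIZES)
print(default_params)
```

Under these assumptions the three layers contribute 330, 930 and 62 parameters respectively.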

Exploration (optional)

Explore properties of the dataset. Ergo can generate graphs and tables that can be useful for the feature engineering of the problem.

Explore can show:

  1. Metrics of each feature (min, max, standard deviation) - Which can be used to discard constant features in the dataset.
  2. Correlation of each feature with the target - Which can give an idea of how good a feature is as a linear predictor.
  3. Feature correlation matrix.
  4. PCA decomposition:
    • 2D projection of the data based on classes.
    • Explained variance of each principal component with 90, 95 and 99 % explanation values.
  5. Kmeans clustering or DBSCAN clustering of the data.
  6. Elbow method to determine the optimal number of clusters for kmeans.
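The first two items of the list can be illustrated in plain Python. This is only a sketch of the underlying statistics, not ergo's implementation: the toy rows are made up, and the helper names column_stats and pearson are hypothetical. It follows the CSV layout used elsewhere in this README (label in the first column).

```python
import math


def column_stats(values):
    """Return (min, max, population standard deviation) of one feature column."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return min(values), max(values), math.sqrt(variance)


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 when either column is constant."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        return 0.0
    return cov / (sx * sy)


# Toy dataset (made up): label first, then three features.
toy_rows = [
    (0, 1.0, 5.0, 0.2),
    (0, 2.0, 5.0, 0.1),
    (1, 3.0, 5.0, 0.9),
    (1, 4.0, 5.0, 0.8),
]
labels = [row[0] for row in toy_rows]
features = list(zip(*[row[1:] for row in toy_rows]))

# A zero standard deviation marks a constant feature that can be discarded.
constant = [i for i, col in enumerate(features) if column_stats(col)[2] == 0]
# Correlation of every feature with the target column.
correlations = [pearson(col, labels) for col in features]
print(constant, correlations)
```

Here the second feature is constant, so it is flagged for removal and its correlation with the target is reported as 0.0.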

Example with a dataset some/path/data.csv:

ergo explore example --dataset some/path/data.csv -p

This will show the PCA decomposition of the dataset, saving (and optionally showing) the explained variance vs the number of principal component vectors used and the 2D projection of the dataset (colored by labels).
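Reading the explained variance plot amounts to finding how many leading components are needed to reach a given cumulative share. A minimal sketch of that lookup, using hypothetical explained-variance ratios in place of values computed from a real dataset:

```python
def components_needed(explained_ratios, threshold):
    """Smallest number of leading components whose cumulative ratio reaches threshold."""
    cumulative = 0.0
    for count, ratio in enumerate(explained_ratios, start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return count
    return None  # the threshold cannot be reached with the given components


# Hypothetical ratios, sorted from the largest component to the smallest.
ratios = [0.60, 0.25, 0.08, 0.04, 0.025, 0.005]
needed = {t: components_needed(ratios, t) for t in (0.90, 0.95, 0.99)}
print(needed)
```

With these made-up ratios, 3 components explain at least 90 % of the variance, 4 reach 95 % and 5 reach 99 %.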

A full exploratory analysis can be performed using the --all flag:

ergo explore example --dataset some/path/data.csv --all 

Encoding (optional)

If you implemented the prepare_input function in the prepare.py script, ergo can encode raw samples (executables, images, strings or anything else) into vectors of scalars, which are then saved into a dataset.csv file suitable for training.

Example with a folder /path/to/data that contains pos and neg subfolders. In auto labeling mode, each group of samples is labeled with its parent directory name:

ergo encode example /path/to/data
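The auto labeling rule (label = name of the parent directory) can be sketched as follows. This illustrates the idea only; the function name labeled_samples is hypothetical, and whether ergo descends into deeper subfolders is not stated here, so the sketch looks exactly one level down.

```python
import tempfile
from pathlib import Path


def labeled_samples(root):
    """Yield (label, path) pairs; the label is the name of the parent subfolder."""
    for folder in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for sample in sorted(p for p in folder.iterdir() if p.is_file()):
            yield folder.name, sample


# Demo on a throwaway directory tree with pos/ and neg/ subfolders.
with tempfile.TemporaryDirectory() as root:
    for label in ("pos", "neg"):
        (Path(root) / label).mkdir()
        (Path(root) / label / "sample.bin").write_bytes(b"\x00")
    demo = [(label, path.name) for label, path in labeled_samples(root)]

print(demo)
```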

Example with a single folder and manual labeling:

ergo encode example /path/to/data --label 'some-label'

Example with a single text file containing multiple inputs, one per line:

ergo encode example /path/to/data --label 'some-label' -m

Training

After defining the model structure and the training process, you can import a CSV dataset (first column must be the label) and start training using 2 GPUs:

ergo train example --dataset /some/path/data.csv --gpus 2

This will split the dataset into train, validation and test sets (partitioned with the --test and --validation arguments), start the training and, once finished, show the model statistics.
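A three-way split of this kind can be sketched in a few lines. The ratios below are placeholders rather than ergo's defaults, and the shuffling strategy is an assumption; the names split_dataset, test_ratio and validation_ratio are hypothetical.

```python
import random


def split_dataset(rows, test_ratio=0.2, validation_ratio=0.2, seed=0):
    """Shuffle rows and split them into (train, validation, test) lists."""
    if not 0 <= test_ratio + validation_ratio < 1:
        raise ValueError("test_ratio + validation_ratio must be in [0, 1)")
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)  # seeded for reproducible splits
    n_test = round(len(shuffled) * test_ratio)
    n_val = round(len(shuffled) * validation_ratio)
    test_set = shuffled[:n_test]
    val_set = shuffled[n_test:n_test + n_val]
    train_set = shuffled[n_test + n_val:]
    return train_set, val_set, test_set


train_rows, val_rows, test_rows = split_dataset(range(100))
print(len(train_rows), len(val_rows), len(test_rows))
```

Every row lands in exactly one of the three sets, so the test set stays unseen during training and validation.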

If you want to update a model and/or train it on already imported data, you can simply:

ergo train example --gpus 2

Testing

Now it's time to visualize the model structure and how the accuracy and loss metrics changed during training (requires sudo apt-get install graphviz python3-tk):

ergo view example

If the data-test.csv file is still present in the project folder (ergo clean has not been called yet), ergo view will also show the ROC curve.

You can use the relevance command to evaluate the model on a given set (or a subset of it, see --ratio 0.1) by nulling one attribute at a time and measuring how that influenced the accuracy (feature.names is an optional file with the names of the attributes, one per line):

ergo relevance example --dataset /some/path/data.csv --attributes /some/path/feature.names --ratio 0.1
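The idea behind this measurement can be sketched independently of ergo. In the sketch below, "nulling" is assumed to mean setting the attribute to zero, the model is replaced by a hand-written stand-in, and the sample rows are made up (label in the first column).

```python
def accuracy(predict, rows):
    """Fraction of rows whose predicted label matches the label in column 0."""
    hits = sum(1 for label, *features in rows if predict(features) == label)
    return hits / len(rows)


def relevance(predict, rows):
    """Accuracy drop observed when each feature in turn is set to zero."""
    baseline = accuracy(predict, rows)
    n_features = len(rows[0]) - 1
    drops = []
    for i in range(n_features):
        nulled = [
            (label, *[0.0 if j == i else v for j, v in enumerate(features)])
            for label, *features in rows
        ]
        drops.append(baseline - accuracy(predict, nulled))
    return drops


def toy_predict(features):
    """Stand-in for a trained model: it only looks at the first feature."""
    return 1 if features[0] > 0.5 else 0


samples = [(1, 0.9, 0.3), (1, 0.8, 0.7), (0, 0.1, 0.6), (0, 0.2, 0.4)]
drops = relevance(toy_predict, samples)
print(drops)
```

The larger the drop, the more the model depends on that attribute; the second feature is ignored by the stand-in model, so nulling it changes nothing.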

Once you're done, you can remove the train, test and validation temporary datasets with:

ergo clean example

Inference

To load the model and start a REST API for evaluation (can be customized with --address, --port, --classes and --debug options):

ergo serve example

To run an inference on a vector of scalars:

curl "http://localhost:8080/?x=0.345,1.0,0.9,..."

If you customized the prepare_input function in prepare.py (see the Encoding section), you can run an inference on a raw sample:

curl "http://localhost:8080/?x=/path/to/sample"

The input x can also be passed as a POST request:

curl --data 'x=...' "http://localhost:8080/"

Or as a file upload:

curl -F 'x=@/path/to/file' "http://localhost:8080/"

The API can also be used to perform encoding only:

curl -F 'x=@/path/to/file' "http://localhost:8080/encode"

This will return the raw features vector that can be used for inference later.
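The same GET request can be issued from Python with the standard library. This is a minimal client sketch: it assumes the server from ergo serve is listening on localhost:8080 as in the curl examples, and it returns the response body as text without parsing it, since the response format is not described here. The function names are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import urlopen


def inference_url(vector, base="http://localhost:8080/"):
    """Build the GET URL for a vector of scalars, passed as the x parameter."""
    x = ",".join(str(v) for v in vector)
    return base + "?" + urlencode({"x": x})


def infer(vector, base="http://localhost:8080/", timeout=5.0):
    """Send the request and return the raw response body as text."""
    with urlopen(inference_url(vector, base), timeout=timeout) as response:
        return response.read().decode("utf-8")


url = inference_url([0.345, 1.0, 0.9])
print(url)
```

Note that urlencode percent-encodes the commas in the vector; this is the standard URL encoding of the same query string used in the curl example.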

Other commands

To reset the state of a project (WARNING: this will remove the datasets, the model files and all training statistics):

ergo clean example --all

Evaluate and compare the performance of two trained models on a given dataset and (optionally) output the differences to a JSON file:

ergo cmp example_a example_b --dataset /path/to/data.csv --to-json diffs.json

Freeze the graph and convert the model to the TensorFlow protobuf format:

ergo to-tf example

Convert the Keras model to frugally-deep format:

ergo to-fdeep example

Optimize a dataset (get unique rows and reuse 15% of the total samples, customize ratio with the --reuse-ratio argument, customize output with --output):

ergo optimize-dataset /some/path/data.csv
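The "unique rows" part of this step can be sketched with the csv module. This only covers exact-duplicate removal under the assumption that rows are compared field by field; the reuse ratio is not modeled, because its exact behavior is not described here. The function name unique_rows is hypothetical.

```python
import csv
import io


def unique_rows(lines):
    """Return CSV rows with exact duplicates dropped, keeping first-seen order."""
    seen = set()
    unique = []
    for row in csv.reader(lines):
        key = tuple(row)
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# In-memory stand-in for a CSV file; the third row repeats the first.
sample_csv = io.StringIO("1,0.1,0.2\n0,0.3,0.4\n1,0.1,0.2\n")
deduped = unique_rows(sample_csv)
print(deduped)
```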

License

ergo was made with ♥ by the dev team and it is released under the GPL 3 license.

More Repositories

1

opensnitch

OpenSnitch is a GNU/Linux interactive application firewall inspired by Little Snitch.
Python
10,819
star
2

pwnagotchi

(⌐■_■) - Deep Reinforcement Learning instrumenting bettercap for WiFi pwning.
JavaScript
5,830
star
3

bettercap

DEPRECATED, bettercap development moved here: https://github.com/bettercap/bettercap
2,507
star
4

xray

XRay is a tool for recon, mapping and OSINT gathering from public networks.
Go
1,966
star
5

bleah

This repository is DEPRECATED, please use bettercap as this tool has been ported to its BLE modules.
1,094
star
6

dnssearch

A subdomain enumeration tool.
Go
879
star
7

arc

A manager for your secrets.
Go
838
star
8

ditto

A tool for IDN homograph attacks and detection.
Go
694
star
9

uroboros

A GNU/Linux monitoring and profiling tool focused on single processes.
Go
670
star
10

shellz

shellz is a small utility to manage your ssh, telnet, kubernetes, winrm, web or any custom shell in a single place.
Go
537
star
11

sg1

A wanna be swiss army knife for data encryption, exfiltration and covert communication.
Go
528
star
12

smali_emulator

This software will emulate a smali source file generated by apktool.
Python
458
star
13

arminject

An application to dynamically inject a shared object into a running process on ARM architectures.
C++
432
star
14

jscythe

Abuse the node.js inspector mechanism in order to force any node.js/electron/v8 based process to execute arbitrary javascript code.
Rust
310
star
15

spycast

A crossplatform mDNS enumeration tool.
HTML
304
star
16

medusa

A fast and secure multi protocol honeypot.
Rust
285
star
17

bettercap-proxy-modules

This repository contains some bettercap transparent proxy example modules.
285
star
18

dirsearch

A Go implementation of dirsearch.
Go
253
star
19

kitsune

🧠 🔎 🤖 Kitsune is an artificial neural network designed to detect and correlate Twitter profiles with similar behaviours.
Python
228
star
20

librestd

A low dependencies and self contained library to create C++ RESTful API services.
C++
199
star
21

shieldwall

zero-trust remote firewall instrumentation
Go
195
star
22

pwnagotchi-plugins-contrib

User contributed Pwnagotchi plugins.
Python
170
star
23

ergo-pe-av

🧠 🦠 An artificial neural network and API to detect Windows malware, based on Ergo and LIEF.
Python
166
star
24

sauron

A minimalistic cross-platform malware scanner with non-blocking realtime filesystem monitoring using YARA rules.
Rust
162
star
25

islazy

A Go library containing a set of opinionated packages, objects, helpers and functions implemented with the KISS principle in mind.
Go
153
star
26

gitstats

Git Repository Analyzer.
Go
145
star
27

gibson

A high performance tree-based cache server.
C
136
star
28

androswat

tool to inspect, dump, modify, search and inject libraries into Android processes.
C++
121
star
29

pdusms

PoC app for raw pdu manipulation on Android.
Java
119
star
30

veryfied

Mark pre-Musk era Twitter actually verified accounts.
JavaScript
113
star
31

ebpf-process-anomaly-detection

Process behaviour anomaly detection using eBPF and unsupervised-learning Autoencoders
Python
108
star
32

pwngrid

(⌐■_■) - API server for pwnagotchi.ai
Go
100
star
33

coffee

Smarter Coffee terminal client.
Python
97
star
34

joe

The Swiss Army knife for backend engineers.
Go
90
star
35

sum

A specialized database server for linear algebra and machine learning.
Go
87
star
36

www.pwnagotchi.ai

(⌐■_■) - pwnagotchi.ai
CSS
84
star
37

nikeplus-fuelband-se-reversed

A proof of concept application that uses the BLE API and the Nike+ FuelBand SE protocol to communicate with Nike BLE devices.
Java
82
star
38

takuan

Takuan is a system service that parses logs and detects noisy attackers in order to build a blacklist database of known cyber offenders.
Go
82
star
39

ftrace

Go library to trace Linux syscalls using the FTRACE kernel framework.
Go
76
star
40

openbank

OpenBank - Your BTC realtime tracker.
PHP
68
star
41

fang

A multi service threaded MD5 cracker
Python
66
star
42

dotfiles

My zsh, bash and vim dot files
Shell
66
star
43

libpe

A C/C++ library to parse Windows portable executables written with speed and stability in mind.
C
64
star
44

brutemachine

A Go library whose main purpose is to provide an interface to loop over a dictionary and use those words/lines as input for some custom logic such as HTTP file bruteforcing, DNS bruteforcing, etc.
Go
53
star
45

fido

Fido is a minimalistic, IDE and language agnostic project generator supporting various toolchains and build systems.
Python
52
star
46

mpcfw

Reverse engineering of Apple MultipeerConnectivity Framework
Python
51
star
47

quijote

Quijote is a highly configurable HTTP middleware for API security.
Go
48
star
48

altair

A Modular Web Vulnerability Scanner
Python
47
star
49

mcaptcha_bypass

PoC to bypass mCaptcha and its rate limiting capabilities from a fully automated bot.
Rust
47
star
50

ergo-planes-detector

🧠✈️ An ergo based project that relies on a convolutional neural network to detect airplanes from satellite imagery.
Python
44
star
51

stork

A small utility that aims to automate and simplify some tasks related to software release cycles.
Go
43
star
52

SafeInCloud

This repository contains a class to decrypt SafeInCloud (https://www.safe-in-cloud.com/) database files and a couple of command line utilities.
Python
41
star
53

pycryptocat

pyCryptoCat - A CryptoCat standalone python client.
JavaScript
36
star
54

unisbom

UniSBOM is a tool to build a software bill of materials on any platform with a unified data format.
Rust
34
star
55

dunmer

An ELF parasite command injector.
C
31
star
56

evilsocket.github.io

evilsocket.github.io files
HTML
28
star
57

dsploit-arpspoof

The dSploit arpspoof module.
C
27
star
58

octoghost

A python script to process Octopress markdown files and write a JSON file ready to import into Ghost.
Python
27
star
59

backup

Backup scripts I use on my drives.
Shell
26
star
60

SoftWire

SoftWire is a class library written in object-oriented C++ for compiling assembly code. It can be used in projects to generate x86 machine code at run-time as an alternative to self-modifying code. Scripting languages might also benefit by using SoftWire as a JIT-compiler back-end. It also allows to eliminate jumps for variables which are temporarily constant during run-time, like for efficient graphics processing by constructing an optimised pipeline. Because of its possibility for 'instruction rewiring' by run-time conditional compilation, I named it "SoftWire". It is targeted only at developers with a good knowledge of C++ and x86 assembly. Project originally by Nicolas Capens, new implementation by Simone Margaritelli aka evilsocket
C++
25
star
61

SWG

Static Website Generator
Python
23
star
62

hybris

Hybris scripting language interpreter engine and standard library modules.
C++
23
star
63

rubertooth

A complete Ruby porting of the ubertooth libraries and utilities.
Ruby
23
star
64

BioIdentify

BioIdentify is a command line tool for fingerprint feature extraction, 1on1 match and 1onN match.
C++
21
star
65

emoticode

This is the code repository for v2.0 of http://www.emoticode.net/
Ruby
21
star
66

gobench

A simple bash script that does its best to automate and visualize differential benchmarking for Go projects.
Shell
21
star
67

eve

(◍•﹏•) - Hi, I'm Eve
Python
21
star
68

clang-ebpf-builder

A Rust crate that simplifies the integration of Rust and eBPF programs written in C.
C
20
star
69

hackrfpp

HackRF C++ playground plus basic demodulation.
C++
20
star
70

wax

Wax is a mediocre fuzzer I'm prototyping to test some ideas and get rid of others.
Go
19
star
71

pineproxy

A ruby self contained, low resource consuming HTTP transparent proxy designed for the WiFi Pineapple MKV.
Ruby
16
star
72

pwngrid-queries-joe

PwnGRID queries for Joe
Go
15
star
73

octofairy

A machine learning based GitHub bot for Issues.
Python
15
star
74

webmon

A webpage monitor bot, currently used to monitor Twitter ToS.
Python
13
star
75

arc-android-wrapper

An Android wrapper for Arc
Java
13
star
76

twitter-num-followers-bot

A bot that'll monitor the number of followers of its followers and tweet when the counter gets to interesting values.
Go
13
star
77

update.dsploit.net

The code repository for the dSploit project update server.
Ruby
13
star
78

Eigetron

Face recognition program that uses Eigen faces with a fast Jacobi eigen decomposition
C++
13
star
79

Pynject

Pynject - An automatic MySQL injector and data dumper tool.
Python
13
star
80

keras-goodreads

book rating system using LSTM and Goodreads
Python
12
star
81

MSP

Multidimensional Space Processing Library
C++
11
star
82

phpgibson

The phpgibson extension provides an API for communicating with the Gibson cache server.
C
10
star
83

llama3-cake

Distributed LLama3 inference.
Rust
10
star
84

mmdb2json

A script to dump a MMDB ( MaxMind ) binary database to JSON.
Python
9
star
85

backupbox

Sup biatch?
Python
9
star
86

Kerby_Surveillance

Kerby - Light weight video surveillance system.
C++
9
star
87

charon

A Ruby library to query the Zen service on SpamHaus.org
Ruby
6
star
88

evilsocket

about me
6
star
89

42

.
5
star
90

sumphp

PHP client interface to the Sum linear algebra database.
PHP
5
star
91

gauntlet-nginx

Gauntlet IPS nginx module
C
5
star
92

libgibsonclient

Gibson cache server native client library.
C
5
star
93

dsploit-thomson

Assembly
5
star
94

rpi-l298n-motor-driver

Those are experimental scripts to be used with a Raspberry Pi, a L298N chip and two motors.
Python
5
star
95

codeforces

Solutions for the codeforces.com problemset section exercises.
C
5
star
96

caspermod

A modified version of the Casper theme for Ghost blog platform.
CSS
4
star
97

takuan-reports

4
star
98

TED

TED (acronym for neTwork Event Daemon) is a GNU/Linux daemon that will inform you of new inbound connections, whether a host on your network is alive or not, and so on.
C++
4
star
99

sumpy

Python client interface to the Sum linear algebra database.
Python
4
star
100

tman

A small utility to inspect and validate safetensors and ONNX files.
Rust
4
star