  • Stars: 609
  • Rank: 73,614 (top 2%)
  • Language: Python
  • License: Apache License 2.0
  • Created: almost 5 years ago
  • Updated: almost 4 years ago

Repository Details

riskquant

A library to assist in quantifying risk.

To install riskquant, run the following command in the root directory:

pip install .

Using riskquant as a library

simpleloss

The simpleloss class uses a single value for frequency, and two values for a magnitude range that are mapped to a lognormal distribution.

The inputs to simpleloss are as follows:

  • Identifier: An identifying label, which need not be unique (see "Uniqueness of identifiers" below) but is intended to provide a way of identifying the scenario.
  • Name: The name of the scenario, which should be more descriptive than the identifier.
  • Frequency: The number of times that the loss will occur over some interval of time (e.g. 0.1 means 1 occurrence per 10 years on average).
  • Low loss magnitude: A dollar value of the best-case scenario, given that the loss did occur. All our detection systems worked, so we found out about the event and remediated quickly.
  • High loss magnitude: A dollar value of the worst-case scenario. Our detection systems didn't work, and the problem persisted for a long time until it was unavoidable to notice it and stop it.
>>> from riskquant import simpleloss
>>> s = simpleloss.SimpleLoss(label="ALICE", name="Alice steals the data", frequency=0.10, low_loss=100000, high_loss=1000000)
>>> s.annualized_loss()

40400.128269457266
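The value above is consistent with fitting a lognormal distribution whose 90% confidence interval is [low_loss, high_loss] (see "Using riskquant as a utility" below) and scaling its mean by the frequency. A minimal sketch of that arithmetic, with illustrative variable names:

# Sketch: reproducing annualized_loss() by hand, assuming [low_loss, high_loss]
# is the 90% confidence interval of a lognormal distribution (see below).
import math
from scipy.stats import norm

frequency, low_loss, high_loss = 0.10, 100000, 1000000

# Mean and standard deviation of log-losses implied by the 90% interval.
mu = (math.log(low_loss) + math.log(high_loss)) / 2.0
sigma = (math.log(high_loss) - math.log(low_loss)) / (2 * norm.ppf(0.95))

mean_single_loss = math.exp(mu + sigma ** 2 / 2)  # mean of the lognormal
print(frequency * mean_single_loss)               # 40400.128269457266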

pertloss

The pertloss class uses two values for a magnitude range that are mapped to a lognormal distribution, and four values for frequency that are used to create a Modified PERT distribution.

The inputs to pertloss are as follows:

  • Low loss magnitude: A dollar value of the best-case scenario, given that the loss did occur. All our detection systems worked, so we found out about the event and remediated quickly.
  • High loss magnitude: A dollar value of the worst-case scenario. Our detection systems didn't work, and the problem persisted for a long time until it was unavoidable to notice it and stop it.
  • Minimum frequency: The lowest number of times a loss will occur over some interval of time.
  • Maximum frequency: The highest number of times a loss will occur over some interval of time.
  • Most likely frequency: The most likely number of times a loss will occur over some interval of time. Sets the skew of the distribution.
  • Kurtosis: A number that controls the shape of the PERT distribution, with a default of 4. Higher values will cause a sharper peak. In FAIR, this is called the "belief in the most likely" frequency, based on the confidence of the estimator in the most likely frequency. With higher kurtosis, more samples in the simulation will be closer to the most likely frequency.
>>> from riskquant import pertloss
>>> p = pertloss.PERTLoss(low_loss=10, high_loss=100, min_freq=0.1, max_freq=0.7, most_likely_freq=0.3, kurtosis=1)
>>> simulate_100 = p.simulate_years(100)
>>> p.summarize_loss(simulate_100)

{'minimum': 0,
 'tenth_percentile': 0,
 'mode': 0,
 'median': 1,
 'ninetieth_percentile': 2,
 'maximum': 6}
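A Modified PERT distribution can be sketched as a Beta distribution rescaled to [min_freq, max_freq], with shape parameters set by the most likely frequency and the kurtosis. This follows the standard Modified PERT construction and is shown for illustration only; riskquant's internals may differ in detail.

# Sketch: Modified PERT frequency distribution as a scaled Beta distribution.
# Standard construction, illustrative only; not riskquant's exact code.
from scipy.stats import beta

min_freq, most_likely_freq, max_freq, kurtosis = 0.1, 0.3, 0.7, 1

span = max_freq - min_freq
shape_a = 1 + kurtosis * (most_likely_freq - min_freq) / span
shape_b = 1 + kurtosis * (max_freq - most_likely_freq) / span

freq_dist = beta(shape_a, shape_b, loc=min_freq, scale=span)
print(freq_dist.mean())  # expected annual frequency, about 0.37 here

With a higher kurtosis, the shape parameters grow and the distribution's mass concentrates around the most likely frequency, which matches the "belief in the most likely" interpretation above.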

Using riskquant as a utility

bin/riskquant --file input.csv

The input.csv file should be formatted with columns

Identifier,Name,Probability,Low_loss,High_loss

The columns are defined as follows:

  • Identifier: An identifying label, which need not be unique (see "Uniqueness of identifiers" below) but is intended to provide a way of identifying the scenario.
  • Name: The name of the scenario, which should be more descriptive than the identifier.
  • Probability: The chance that the loss will occur over some interval of time (typically a year). More precisely, this is defined as a rate of occurrence (e.g. 0.1 means 1 occurrence per 10 years on average).
  • Low_loss: The best-case scenario, given that the loss did occur. All our detection systems worked, so we found out about the event and remediated quickly.
  • High_loss: The worst-case scenario. Our detection systems didn't work, and the problem persisted for a long time until it was unavoidable to notice it and stop it.

The range [Low_loss, High_loss] should cover the 90% "confidence interval" of losses, i.e. we are 90% confident that the loss would fall in that range. Given this range, we map to a lognormal distribution, which allows no negative loss values and has a long tail of high losses (so that blowout losses exceeding the "worst case" scenario can still occur).

Also, the Probability is mapped to a Poisson distribution, so the loss can actually occur more than once a year but on average occurs at the rate given.
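Putting the two together, one simulated year can be sketched as: draw a Poisson event count at the given rate, then draw a lognormal loss amount for each event. This is an illustrative stand-in for the library's simulation, not its exact code; the parameters are ALICE's row from the example below.

# Sketch: one simulated year (illustrative, not the library's exact code).
import math
from scipy.stats import lognorm, norm, poisson

probability, low_loss, high_loss = 0.01, 1000000, 10000000

mu = (math.log(low_loss) + math.log(high_loss)) / 2.0
sigma = (math.log(high_loss) - math.log(low_loss)) / (2 * norm.ppf(0.95))
magnitude = lognorm(sigma, scale=math.exp(mu))

n_events = poisson(mu=probability).rvs()  # number of losses this year
losses = magnitude.rvs(size=n_events)     # one loss amount per event
print(losses.sum())                       # total loss for the year; usually 0.0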

For example:

ALICE,"Alice steals the data",0.01,1000000,10000000
BOB,"Bob steals the data",0.10,10000000,1000000000
CHARLIE,"Charlie loses the data",0.05,5000000,50000000

When run as an executable, the CSV input is converted to a CSV with a prioritized list of loss scenarios:

BOB,Bob steals the data,"$26,600,000"
CHARLIE,Charlie loses the data,"$1,010,000"
ALICE,Alice steals the data,"$40,400"

Note that by default the output uses 3 significant digits, which is generally more than enough to capture the precision of the inputs.

By default, the executable will also generate a Loss Exceedance Curve (LEC), which is a statistical description of the combined risk due to all the loss scenarios. The curve is generated by simulating many possible years and, for each simulated year, summing the losses across all scenarios that occurred. We then ask: what loss magnitude was exceeded in 90% of the simulated years? In 80%? In 70%? And so on. The curve therefore shows the chance (y-axis) of exceeding some particular loss amount (x-axis) in aggregate across all the scenarios.
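As a sketch, the exceedance points can be read off as quantiles of the simulated yearly totals. Here the totals are faked with random data; in practice they would come from the simulation described above.

# Sketch: loss-exceedance points from simulated yearly loss totals.
# yearly_losses is stand-in random data, not real simulation output.
import numpy as np

rng = np.random.default_rng(0)
yearly_losses = rng.lognormal(mean=12.0, sigma=1.5, size=100000)

for p in (0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1):
    # A loss exceeded with probability p is the (1 - p) quantile of the totals.
    threshold = np.quantile(yearly_losses, 1.0 - p)
    print(f"{p:.0%} of simulated years exceed ${threshold:,.0f}")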

Command line arguments

Required argument:

--file : CSV of scenario name and parameters

Optional arguments:

-V / --version : Version number
--years <n> : number of years to simulate [ default 100,000 ]
--sigdigs <n> : number of significant digits in output values [ default 3 ]
--plot : Generate Loss Exceedance Curve [ default true ]

Using riskquant via Docker

Here's how to build and run riskquant using Docker.

First, build the image locally:

docker build -t riskquant .

This step only needs to be done once for a given version of riskquant.

Then, assuming your inputs are in a data sub-directory, you can analyze them with riskquant using a command like:

docker container run --rm -it \
  -v "$(PWD)/data/":/data/ \
  riskquant --file /data/input.csv

When running riskquant via Docker, Docker needs to mount a local directory into the container so that riskquant can read inputs from and write outputs to that directory. This command mounts the local data directory into the container at /data. The --file option tells riskquant where to find the file when running inside the container. You can pass other options to riskquant just like normal, including --help.

Now, check the analysis results with a command like:

cat data/input_prioritized.csv

Uniqueness of identifiers

An identifier could be duplicated, for example, if we distinguish between an internal actor and an external attacker as variants of the same loss scenario, but with different (and independent) probabilities and impacts.
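For example (illustrative values only), both rows below reuse the identifier ALICE but model independent variants of the same scenario:

ALICE,"Alice, an insider, steals the data",0.01,1000000,10000000
ALICE,"An external attacker compromises Alice's account and steals the data",0.05,1000000,10000000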

help(riskquant)

Help on package riskquant:

NAME
   riskquant - Risk quantification library. Import models in this package as needed.

PACKAGE CONTENTS
   multiloss
   simpleloss

FUNCTIONS
   csv_to_simpleloss(file)
       Convert a CSV file with parameters to SimpleLoss objects

       :arg: file = Name of CSV file to read. Each row should contain
                    name, probability, low_loss, high_loss

       :returns: List of SimpleLoss objects

   main(args=None)

DATA
   NAME_VERSION = 'riskquant 1.0'
   __appname__ = 'riskquant'

VERSION
   1.0

help(riskquant.simpleloss)

    |
    |  __init__(self, label, name, p, low_loss, high_loss)
    |      Initialize self.  See help(type(self)) for accurate signature.
    |
    |  annualized_loss(self)
    |      Expected mean loss per year as scaled by the probability of occurrence
    |
    |      :returns: Scalar of expected mean loss on an annualized basis.
    |
    |  simulate_losses_one_year(self)
    |      Generate a random number of losses, and loss amount for each.
    |
    |      :returns: List of loss amounts, or empty list if no loss occurred.
    |
    |  simulate_years(self, n)
    |      Draw randomly to simulate n years of possible losses.
    |
    |      :arg: n = Number of years to simulate
    |      :returns: List of length n with loss amounts per year. Amount is 0 if no loss occurred.
    |
    |  single_loss(self)
    |      Draw a single loss amount. Not scaled by probability of occurrence.
    |
    |      :returns: Scalar value of a randomly generated single loss amount.
    |
    |  ----------------------------------------------------------------------
    |  Data descriptors defined here:
    |
    |  __dict__
    |      dictionary for instance variables (if defined)
    |
    |  __weakref__
    |      list of weak references to the object (if defined)

DATA
   lognorm = <scipy.stats._continuous_distns.lognorm_gen object>
   norm = <scipy.stats._continuous_distns.norm_gen object>

help(riskquant.multiloss)

    |  Methods defined here:
    |
    |  __init__(self, loss_list)
    |      Initialize self.  See help(type(self)) for accurate signature.
    |
    |  loss_exceedance_curve(self, n, title='Aggregated Loss Exceedance', xlim=[1000000, 10000000000], savefile=None)
    |      Generate the Loss Exceedance Curve for the list of losses. (Uses simulate_years)
    |
    |      :arg: n = Number of years to simulate and display the LEC for.
    |            [title] = An alternative title for the plot.
    |            [xlim] = An alternative lower and upper limit for the plot's x axis.
    |            [savefile] = Save a PNG version to this file location instead of displaying.
    |
    |      :returns: None. If display=False, returns the matplotlib axis array
    |                (for customization).
    |
    |  prioritized_losses(self)
    |      Generate a prioritized list of losses from the loss list.
    |
    |      :returns: List of [(name, annualized_loss), ...] in descending order of annualized_loss.
    |
    |  simulate_years(self, n)
    |      Simulate n years across all the losses in the list.
    |
    |      :arg: n = The number of years to simulate
    |
    |      :returns: List of [loss_year_1, loss_year_2, ...] where each is a sum of all
    |                losses experienced that year.
    |
    |  ----------------------------------------------------------------------
    |  Data descriptors defined here:
    |
    |  __dict__
    |      dictionary for instance variables (if defined)
    |
    |  __weakref__
    |      list of weak references to the object (if defined)
