• This repository has been archived on 03/Dec/2019
  • Language
    Scala
  • License
    MIT License
  • Created over 7 years ago
  • Updated almost 5 years ago


Repository Details

Dataflow analysis & differential privacy for SQL queries. This project is deprecated and not maintained.

Overview

(This project is deprecated and not maintained.)

This repository contains a query analysis and rewriting framework to enforce differential privacy for general-purpose SQL queries. The rewriting engine can automatically transform an input query into an intrinsically private query which embeds a differential privacy mechanism in the query directly; the transformed query enforces differential privacy on its results and can be executed on any standard SQL database. This approach supports many state-of-the-art differential privacy mechanisms; the code currently includes rewriters based on Elastic Sensitivity and Sample and Aggregate, and more will be added soon.

The rewriting framework is built on a robust dataflow analysis engine for SQL queries. This engine provides an abstract representation of queries, plus several built-in dataflow analyses tailored to this representation. It can also be used to implement other types of dataflow analyses, as described below.

Building & Running

This framework is written in Scala and built using Maven. The code has been tested on Mac OS X and Linux. To build the code:

$ mvn package

Example: Query Rewriting

The file examples/QueryRewritingExample.scala contains sample code for query rewriting and demonstrates the supported mechanisms using a few simple queries. To run this example:

$ mvn exec:java -Dexec.mainClass="examples.QueryRewritingExample"

This example code can be easily modified, e.g., to test different queries or change parameter values.

Background: Elastic Sensitivity

Elastic sensitivity is an approach for efficiently approximating the local sensitivity of a query, which can be used to enforce differential privacy for the query. The approach requires only a static analysis of the query and therefore imposes minimal performance overhead. Importantly, it does not require any changes to the database. Details of the approach are available in the paper "Towards Practical Differential Privacy for SQL Queries" (Johnson, Near, and Song).

Elastic sensitivity can be used to determine the scale of random noise necessary to make the results of a query differentially private. For a given output column of a query with elastic sensitivity s, it suffices to smooth s to obtain S, using the smooth sensitivity approach introduced by Nissim et al., and then add random noise drawn from the Laplace distribution, centered at 0 and scaled to (S/epsilon), to the true result of the query.
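The noise-addition step can be sketched in a few lines of Scala. This is only an illustration and not the code in this repository: the names LaplaceNoiseSketch, sampleLaplace, and addNoise are hypothetical, the smoothed sensitivity S is assumed to have been computed already, and scala.util.Random is used for brevity even though a real deployment would want a cryptographically secure source of randomness.

```scala
import scala.util.Random

// Hypothetical helper, not part of this repository's API.
object LaplaceNoiseSketch {

  // Draws one sample from a Laplace distribution centered at 0 with the given
  // scale, using the fact that the difference of two independent Exp(1)
  // variables, multiplied by the scale, is Laplace-distributed.
  def sampleLaplace(scale: Double, rng: Random): Double = {
    require(scale > 0, "scale must be positive")
    // nextDouble() is in [0, 1), so 1 - nextDouble() is in (0, 1] and log is finite.
    val e1 = -math.log(1.0 - rng.nextDouble())
    val e2 = -math.log(1.0 - rng.nextDouble())
    scale * (e1 - e2)
  }

  // Adds Laplace noise scaled to (S / epsilon) to the true result of one
  // output column, where S is the already-smoothed sensitivity (assumed given).
  def addNoise(trueResult: Double, smoothedSensitivity: Double,
               epsilon: Double, rng: Random): Double = {
    require(epsilon > 0, "epsilon must be positive")
    trueResult + sampleLaplace(smoothedSensitivity / epsilon, rng)
  }
}
```

For example, LaplaceNoiseSketch.addNoise(1000.0, 2.0, 0.1, new Random()) perturbs a true count of 1000 with noise of scale 20. The returned value differs on every call, which is the intended behavior.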

The class examples.ElasticSensitivityExample contains code demonstrating this approach directly (i.e., applying noise manually rather than generating an intrinsically private query).

To run this example:

$ mvn exec:java -Dexec.mainClass="examples.ElasticSensitivityExample"

Analysis Framework

This framework can perform additional analyses on SQL queries, and can be extended with new analyses. Each analysis in this framework extends the base class com.uber.engsec.dp.sql.AbstractAnalysis.

To run an analysis on a query, call the method com.uber.engsec.dp.sql.AbstractAnalysis.analyzeQuery. The parameter of this method is a string containing a SQL query, and its return value is an abstract domain representing the results of the analysis.

The source code includes several example analyses to demonstrate features of the framework. The simplest example is com.uber.engsec.dp.analysis.taint.TaintAnalysis, which returns an abstract domain containing information about which output columns of the query might contain data flowing from "tainted" columns in the database. The database schema determines which columns are tainted. You can invoke this analysis as follows:

scala> (new com.uber.engsec.dp.analysis.taint.TaintAnalysis).analyzeQuery("SELECT my_col1 FROM my_table")
BooleanDomain = my_col1 -> False

This code includes several built-in analyses, including:

  • The elastic sensitivity analysis, available in com.uber.engsec.dp.analysis.differential_privacy.ElasticSensitivityAnalysis, returns an abstract domain (com.uber.engsec.dp.analysis.differential_privacy.SensitivityDomain) that maps each output column of the query to its elastic sensitivity.
  • com.uber.engsec.dp.analysis.columns_used.ColumnsUsedAnalysis lists the original database columns from which the results of each output column are computed.
  • com.uber.engsec.dp.analysis.histogram.HistogramAnalysis reports, for each output column of the query, whether the column is an aggregation and, if so, which type.
  • com.uber.engsec.dp.analysis.join.JoinKeysUsed lists the original database columns used as equijoin keys for each output column of the query.

Writing New Analyses

New analyses can be implemented by extending one of the abstract analysis classes and implementing transfer functions which describe how to update the analysis state for relevant query constructs. Analyses are written to update a specific type of abstract domain which represents the current state of the analysis. Each abstract domain type implements the trait com.uber.engsec.dp.dataflow.AbstractDomain.

The simplest way to implement a new analysis is to use com.uber.engsec.dp.dataflow.dp.column.AbstractColumnAnalysis, which automatically tracks analysis state for each column of the query independently. Most of the example analyses are of this type.
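To make the ideas of an abstract domain and a transfer function concrete, here is a minimal, self-contained sketch of a taint-style analysis over a toy expression tree. It does not use this framework's classes: Domain, TaintDomain, Expr, and analyze are hypothetical names invented for illustration, and the real base classes may be shaped quite differently.

```scala
// Hypothetical toy example; none of these names come from the framework.
object ToyTaintSketch {

  // A stand-in for an abstract domain: a bottom element plus a join
  // that merges the analysis state of several inputs.
  trait Domain[T] {
    def bottom: T
    def join(a: T, b: T): T
  }

  // Taint is a Boolean: joining inputs yields "tainted" if any input is tainted.
  object TaintDomain extends Domain[Boolean] {
    val bottom: Boolean = false
    def join(a: Boolean, b: Boolean): Boolean = a || b
  }

  // A tiny expression tree standing in for one output column of a query.
  sealed trait Expr
  final case class ColumnRef(table: String, column: String) extends Expr
  final case class Literal(value: String) extends Expr
  final case class FunctionCall(name: String, args: List[Expr]) extends Expr

  // Transfer functions: one case per construct, each describing how the
  // analysis state is derived from the states of the construct's inputs.
  def analyze(expr: Expr, tainted: Set[(String, String)]): Boolean = expr match {
    case ColumnRef(table, column) => tainted.contains((table, column))
    case Literal(_)               => TaintDomain.bottom
    case FunctionCall(_, args) =>
      args.map(arg => analyze(arg, tainted))
        .foldLeft(TaintDomain.bottom)((a, b) => TaintDomain.join(a, b))
  }
}
```

With tainted = Set(("users", "email")), analyzing FunctionCall("concat", List(Literal("x"), ColumnRef("users", "email"))) reports the output as tainted, while ColumnRef("users", "id") is reported as untainted. Tracking one state per output column in this way mirrors the per-column idea described above, though only loosely.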

New analyses can be invoked in the same way as the built-in example analyses.

Reporting Security Bugs

Please report security bugs through HackerOne.

License

This project is released under the MIT License.

Contact Information

This project is developed and maintained by Noah Johnson and Joe Near.
