Combating Fingerprinting with a Privacy Budget

Current state of affairs

Browsers have been making changes to how cookies are treated. Blunt approaches to cookie blocking have been tried, and in response we have seen some user-tracking efforts move underground, employing harder-to-detect methods that subvert cookie controls. These methods, known as ‘fingerprinting’, rely on various techniques to examine what makes a given user’s browser unique.

Because fingerprinting is neither transparent nor under the user’s control, it results in tracking that doesn’t respect user choice.

End state to aim for

Fundamentally, we want to limit how much information about individual users is exposed to sites so that, in total, it is insufficient to identify and track users across the web, except possibly as part of large, heterogeneous groups.

There are several ways to quantify the degree to which each user is partially identifiable from the information shared with third parties, including k-anonymity (where each user shares identical information with at least k − 1 other users), entropy (an information-theoretic measure of uncertainty), and differential privacy (ensuring that aggregated data does not reveal the inclusion of an individual’s data in the set). Our maximum tolerance for revealing information about each user is termed the privacy budget.
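As a rough illustration of the entropy measure, the standard information-theoretic definitions can be written as follows. The notation is an assumption made here for illustration and does not come from the proposal: p(x) is the fraction of users for whom a surface reports the value x, and N is the number of users in the population.

```latex
% Surprisal: bits revealed when a surface reports the value x
I(x) = -\log_2 p(x)

% Entropy: expected bits revealed by the surface over all users
H(X) = -\sum_{x} p(x) \log_2 p(x)

% Bits needed to single out one user among N equally likely users
I_{\text{unique}} = \log_2 N
```

For example, a value shared by 1 in 1024 users reveals 10 bits, and since 2^33 is roughly 8.6 billion, about 33 bits are enough to single out one person among 8 billion. A privacy budget expressed in bits would therefore need to sit well below log2 N for the population in question.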

What is a fingerprinting surface?

A fingerprinting surface is an interaction point where a website can learn something that is stable or semi-stable for a given user or device and that varies between users or devices. A somewhat obvious example is anything that reveals the device model or specific device hardware. Surfaces include data returned from JavaScript APIs, network identifiers such as headers or the IP address, the user’s language, timing differences, and so on. Put another way, anything that reveals any information about the user can be used as a fingerprinting surface, though most surfaces are only useful when combined with multiple other sources of information.
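The point about combining surfaces can be sketched with a toy population. Everything in this example is hypothetical: the `USERS` table, the surface names, and the `smallest_anonymity_set` helper are invented for illustration and do not reflect real measurements.

```python
from collections import Counter

# Hypothetical observations: one row per user, one key per surface.
USERS = [
    {"language": "en-US", "timezone": "UTC-5", "gpu": "A"},
    {"language": "en-US", "timezone": "UTC-5", "gpu": "B"},
    {"language": "en-US", "timezone": "UTC+1", "gpu": "A"},
    {"language": "de-DE", "timezone": "UTC+1", "gpu": "A"},
    {"language": "de-DE", "timezone": "UTC+1", "gpu": "B"},
    {"language": "en-US", "timezone": "UTC-5", "gpu": "A"},
]


def smallest_anonymity_set(users, surfaces):
    """Size of the smallest group of users that share the same values
    for every surface in `surfaces` (the k of k-anonymity)."""
    groups = Counter(tuple(user[s] for s in surfaces) for user in users)
    return min(groups.values())


if __name__ == "__main__":
    for combo in (["language"], ["timezone"], ["gpu"],
                  ["language", "timezone", "gpu"]):
        print(combo, smallest_anonymity_set(USERS, combo))
```

In this toy data no single surface isolates anyone, but the three surfaces together leave some users in a group of one.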

How to get there

Measure information exposed by each fingerprinting surface

Each browser can start by instrumenting itself to estimate the information revealed by each surface and report it back through telemetry. This data will be used for the next phase.
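One way such an estimate might be computed is sketched below, assuming telemetry yields a count of users per reported value for a surface. The `counts` data and both function names are hypothetical; the proposal does not specify an estimator.

```python
import math


def surprisal_bits(count, total):
    """Bits revealed by a value reported by `count` of `total` users."""
    return -math.log2(count / total)


def entropy_bits(counts):
    """Expected bits revealed by a surface, given per-value user counts."""
    total = sum(counts.values())
    return sum(
        (c / total) * surprisal_bits(c, total)
        for c in counts.values()
        if c > 0  # values nobody reported contribute nothing
    )


if __name__ == "__main__":
    # Hypothetical telemetry for a screen-resolution surface.
    counts = {"1920x1080": 4, "1366x768": 2, "2560x1440": 1, "3840x2160": 1}
    print(entropy_bits(counts))
    print(surprisal_bits(counts["3840x2160"], sum(counts.values())))
```

Entropy describes the surface on average, while surprisal describes what a particular user with a rare value gives up; a budget could plausibly be based on either.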

Measure total information exposed to each site

This is a second measurement step: each browser can use the data collected in the previous step to account for the total information exposed to each site and report that back through telemetry. This data will help determine what sort of privacy budget can be enforced without causing large-scale web breakage. The measurement will need to cover the entirety of a user's interaction with a domain, until site data is cleared. If the telemetry system allows reporting data about individual domains, it could be used for outreach to sites that access data above the proposed limit, to see whether changes can bring them under the limit.
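A minimal sketch of per-domain accounting follows. The `SiteLedger` class and its methods are hypothetical, and it assumes that the bits from different surfaces simply add up. That assumption treats surfaces as independent, so for correlated surfaces the total is an overestimate.

```python
from collections import defaultdict


class SiteLedger:
    """Tracks estimated bits exposed to each domain until site data is cleared."""

    def __init__(self):
        self._seen = defaultdict(set)    # domain -> surfaces already charged
        self._bits = defaultdict(float)  # domain -> running total in bits

    def record(self, domain, surface, bits):
        """Charge a surface once per domain; re-reading it reveals nothing new."""
        if surface not in self._seen[domain]:
            self._seen[domain].add(surface)
            self._bits[domain] += bits
        return self._bits[domain]

    def total(self, domain):
        return self._bits.get(domain, 0.0)

    def clear_site_data(self, domain):
        """Reset accounting for a domain when its site data is cleared."""
        self._seen.pop(domain, None)
        self._bits.pop(domain, None)

    def over_limit(self, limit_bits):
        """Domains whose total exceeds a proposed limit (candidates for outreach)."""
        return sorted(d for d, b in self._bits.items() if b > limit_bits)


if __name__ == "__main__":
    ledger = SiteLedger()
    ledger.record("a.example", "screen", 4.0)
    ledger.record("a.example", "fonts", 7.5)
    ledger.record("b.example", "screen", 4.0)
    print(ledger.over_limit(10.0))
```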

Enforce the privacy budget

Once we’re ready to enforce the privacy budget, subsequent API calls that would exceed the budget will either throw an error or, if possible, be replaced with a privacy-preserving version of the API that returns imprecise or noisy results, or a generic result that doesn’t vary between users. Another option to consider is to deny storage and network requests to the page after the budget is hit, so that it cannot exfiltrate any new information it learns.
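The two enforcement outcomes described above (an error, or a privacy-preserving substitute) might look roughly like this. The `PrivacyBudget` class, the `BudgetExceededError` exception, the surface names, and the bit costs are all hypothetical; no browser exposes such an interface.

```python
class BudgetExceededError(Exception):
    """Raised when a read would exceed the budget and no fallback exists."""


class PrivacyBudget:
    def __init__(self, budget_bits):
        self.budget_bits = budget_bits
        self.spent_bits = 0.0
        self._charged = set()  # surfaces this site has already paid for

    def read(self, name, cost_bits, precise, fallback=None):
        """Return precise() if affordable, else fallback(), else raise."""
        if name in self._charged:
            return precise()  # already revealed; no additional cost
        if self.spent_bits + cost_bits <= self.budget_bits:
            self._charged.add(name)
            self.spent_bits += cost_bits
            return precise()
        if fallback is not None:
            return fallback()  # generic or coarsened value, not charged
        raise BudgetExceededError(f"reading {name!r} would exceed the budget")


if __name__ == "__main__":
    budget = PrivacyBudget(budget_bits=5.0)
    print(budget.read("device_memory", 3.0, lambda: 16, lambda: 8))
    print(budget.read("cpu_cores", 3.0, lambda: 12, lambda: 4))
    try:
        budget.read("canvas_hash", 3.0, lambda: "f3a9")
    except BudgetExceededError as err:
        print("blocked:", err)
```

The fallback in this sketch is a fixed generic value; a noisy variant would need care so that repeated reads cannot be averaged back to the precise value.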

Exceptions

There are certain applications, such as 3D games or video conferencing, that may never be able to run within any reasonable privacy budget. For such applications, there will need to be an “escape valve,” an appropriately worded permission prompt informing the user that granting additional API access may allow the site to identify them.

Certain permission-prompt-inducing API calls may in themselves reveal more information than the privacy budget would allow. Granting permission to such a prompt (e.g. getUserMedia) may grant a budget exemption by itself, but we will need to ensure the prompts communicate that as well.

Passive surfaces

Some fingerprinting surfaces, such as the UA string, IP address, and Accept-Language header, are passive in that they are available to every website whether it asks for them or not. For the purposes of privacy budget accounting, we will have to assume that each of these is being consumed by the site and therefore eats into the budget.

Since we would like to provide the web with as much functionality as possible, and since many developers would rather spend their budget on more useful things than knowing the OS of the user’s machine (for example), we should work to remove passive fingerprinting surfaces and replace them with active ways to query the specific information developers need. For the previous example of the user’s OS version, if a website such as a download site needs that information, we propose allowing it to request the information via Client Hints rather than continuing to send it with every request in the UA string. Client Hints can similarly be used to query the user’s language for websites that need it, rather than delivering that information to every server via the Accept-Language header.
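A server-side sketch of the download-site case follows. It assumes the `Accept-CH` response header and the `Sec-CH-UA-Platform` and `Sec-CH-UA-Platform-Version` request headers as defined by the Client Hints and User-Agent Client Hints specifications; the `/download` path and both function names are hypothetical. A language hint is left out because the text does not name one.

```python
def response_headers(path):
    """Ask for OS hints only on pages that need them (hypothetical routing)."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    if path.startswith("/download"):
        hints = "Sec-CH-UA-Platform, Sec-CH-UA-Platform-Version"
        headers["Accept-CH"] = hints  # opt in to receiving these hints
        headers["Vary"] = hints       # the response depends on them
    return headers


def platform_from_request(request_headers):
    """Extract (platform, version) from hint headers, or None where absent."""
    lowered = {k.lower(): v for k, v in request_headers.items()}

    def unquote(value):
        # Hint values arrive as quoted strings, e.g. '"Windows"'.
        return value.strip().strip('"') if value is not None else None

    return (
        unquote(lowered.get("sec-ch-ua-platform")),
        unquote(lowered.get("sec-ch-ua-platform-version")),
    )


if __name__ == "__main__":
    print(response_headers("/download/latest"))
    print(platform_from_request({
        "Sec-CH-UA-Platform": '"Windows"',
        "Sec-CH-UA-Platform-Version": '"10.0.0"',
    }))
```

Because the hint is requested only on the download path, other pages on the same site would not need to spend budget on it.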
