  • Stars: 408
  • Rank: 105,946 (Top 3%)
  • Language: TypeScript
  • License: MIT License
  • Created about 7 years ago
  • Updated almost 2 years ago

BLeak: Automatically Debugging Memory Leaks in Web Applications

BLeak v1.2.2

BLeak automatically finds, ranks, and diagnoses memory leaks in client-side web applications.

BLeak uses a short developer-provided script to drive the application in a loop through specific visual states (e.g., the inbox view and email view of a mail client), and uses that loop as an oracle to find memory leaks: because the application keeps returning to the same visual state, memory that keeps growing across iterations is likely leaked. In our experience, BLeak's precision is often 100% (i.e., no false positives), and fixing the leaks it finds reduces heap growth by 94% on average on a corpus of real production web apps.
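
To make the idea of a state loop concrete, here is a minimal sketch of how a driver could step an application through such states. It is not BLeak's implementation: `sleep`, `waitFor`, `runLoop`, and the fake in-memory `app` are hypothetical names used only for illustration. Each state pairs a `check` function (is the app ready?) with a `next` function (move to the next state), mirroring the format described under Configuration File.

```javascript
// Hypothetical driver, for illustration only (not BLeak's actual code).
function sleep(ms) {
  return new Promise((resolve) => setTimeout(resolve, ms));
}

// Polls check() until it returns true, or throws once timeoutMs has elapsed.
async function waitFor(check, timeoutMs, pollMs) {
  const deadline = Date.now() + timeoutMs;
  while (!check()) {
    if (Date.now() > deadline) {
      throw new Error('Timed out waiting for state');
    }
    await sleep(pollMs);
  }
}

// Runs the loop `iterations` times. After each full pass, once the app is
// back in the first state, records one sample via sample().
async function runLoop(loop, iterations, sample, timeoutMs = 1000, pollMs = 5) {
  const samples = [];
  for (let i = 0; i < iterations; i++) {
    for (const state of loop) {
      await waitFor(state.check, timeoutMs, pollMs);
      state.next();
    }
    await waitFor(loop[0].check, timeoutMs, pollMs);
    samples.push(sample());
  }
  return samples;
}

// Fake two-view app with a deliberate leak: every visit to the thread view
// registers a listener that is never removed.
const app = { view: 'inbox', listeners: [] };
const loop = [
  {
    check: () => app.view === 'inbox',
    next: () => { app.view = 'thread'; app.listeners.push(() => {}); }
  },
  {
    check: () => app.view === 'thread',
    next: () => { app.view = 'inbox'; }
  }
];

const demo = runLoop(loop, 4, () => app.listeners.length);
demo.then((samples) => console.log(samples)); // prints [ 1, 2, 3, 4 ]
```

The sample taken in the same visual state grows on every iteration, which is the kind of signal a loop-based oracle can use to flag a leak.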

For more information, please see the BLeak website and our academic paper, which appeared at PLDI 2018.

Prerequisites

The following must be installed for BLeak to work:

  • mitmproxy v4 (tested with 4.0.1)
  • Python 3.6 or greater
    • Our mitmproxy plugin uses new Python async features

Also, make sure port 8080 is free, as that is the port that mitmproxy uses by default.

Installing

npm install -g bleak-detector

After installing, you should be able to run bleak from the command line.

Using

  1. Install BLeak (see above).
  2. Write a configuration file for your web application (see below).
  3. Run bleak run --config path/to/config.js --out path/to/where/you/want/output
    • The output directory should be unique for this specific run of BLeak, otherwise it will overwrite files in the directory. It will be created if needed.
  4. Wait. BLeak typically finishes in under 10 minutes, but its speed depends on the number of states in your loop and the speed of your web application.
  5. Run the BLeak Results Viewer by running bleak viewer and navigating to http://localhost:8889/ in a web browser. Upload path/to/where/you/want/output/bleak_results.json to the web application to view the results!
    • Alternatively, BLeak prints out a report in bleak_report.log in the same directory, but the results viewer presents additional information not captured in that log file.
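
Because each run needs its own output directory, it can help to generate one per invocation. The sketch below builds the `bleak run` arguments from step 3 with a timestamped output path. `uniqueOutDir` and `bleakRunArgs` are hypothetical helper names, and the `bleak-runs` root directory is an assumption; only `run`, `--config`, and `--out` come from the steps above.

```javascript
const path = require('path');

// Hypothetical helper: derive a per-run output directory from a timestamp.
// Colons and dots are replaced so the name is safe on common file systems.
function uniqueOutDir(root, now = new Date()) {
  const stamp = now.toISOString().replace(/[:.]/g, '-');
  return path.join(root, `run-${stamp}`);
}

// Hypothetical helper: the argument list for `bleak run`, as used in step 3.
function bleakRunArgs(configPath, outDir) {
  return ['run', '--config', configPath, '--out', outDir];
}

// A fixed date is used here so the example is reproducible; omit the second
// argument to use the current time.
const outDir = uniqueOutDir('bleak-runs', new Date(Date.UTC(2018, 5, 18, 12, 0, 0)));
const args = bleakRunArgs('path/to/config.js', outDir);
console.log(['bleak', ...args].join(' '));
```

The resulting `args` array could then be handed to Node's `child_process.spawn('bleak', args, { stdio: 'inherit' })` to start the run from a script.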

Configuration File

BLeak reads a configuration file that tells it how to drive your web application while it looks for client-side memory leaks. Only a few fields are required.

// URL to the web application.
exports.url = "http://path/to/my/site";
// Runs your program in a loop. Each item in the array is a `state`. Each `state` has a "check"
// function, and a "next" function to transition to the next state in the loop. These run
// in the global scope of your web app.
// BLeak assumes that the app is in the first state when it navigates to the URL. If you specify
// optional setup states, then it assumes that the final setup state transitions the web app to
// the first state in the loop.
// The last state in the loop must transition back to the first.
exports.loop = [
  // First state
  {
    // Return 'true' if the web application is ready for `next` to be run.
    check: function() {
      // Example: `group-listing` must be on the webpage
      return !!document.getElementById('group-listing');
    },
    // Transitions to the next state.
    next: function() {
      // Example: Navigate to the first thread
      document.getElementById("thread-001").click();
    }
  },
  // Second (and last) state
  {
    check: function() {
      // Example: Make sure the body of the thread has loaded.
      return !!document.getElementById('thread-body');
    },
    // Since this is the last state in the loop, it must transition back to the first state.
    next: function() {
      // Example: Click back to group listing
      document.getElementById('group-001').click();
    }
  }
];

// (Optional) Number of loop iterations to perform during leak detection (default: 8)
exports.iterations = 8;

// (Optional) An array of states describing how to login to the application. Executed *once*
// to set up the session. See 'config.loop' for a description of a state.
exports.login = [
  {
    check: function() {
      // Return 'true' if the element 'password-field' exists.
      return !!document.getElementById('password-field');
    },
    next: function() {
      // Log in to the application.
      const pswd = document.getElementById('password-field');
      const uname = document.getElementById('username-field');
      const submitBtn = document.getElementById('submit');
      uname.value = 'spongebob';
      pswd.value = 'squarepants';
      submitBtn.click();
    }
  }
];
// (Optional) An array of states describing how to get from config.url to the first state in
// the loop. Executed each time the tool explicitly re-navigates to config.url. See
// config.loop for a description of states.
exports.setup = [];
// (Optional) How long (in milliseconds) to wait for a state transition to finish before declaring an error.
// Defaults to 10 minutes
exports.timeout = 10 * 60 * 1000;
// (Optional) How long (in milliseconds) to wait between a check() returning 'true' and transitioning to the next step or taking a heap snapshot.
// Default: 1000
exports.postCheckSleep = 1000;
// (Optional) How long (in milliseconds) to wait between transitioning to the next step and running check() for the first time.
// Default: 0
exports.postNextSleep = 0;
// (Optional) How long (in milliseconds) to wait between submitting login credentials and reloading the page for a run.
// Default: 5000
exports.postLoginSleep = 5000;
// (Optional) An array of numerical IDs identifying leaks with fixes in your code. Used to
// evaluate memory savings with different leak configurations and the effectiveness of bug fixes.
// In the code, condition the fix on $$$SHOULDFIX$$$(ID), or add logic to `exports.rewrite` (see below),
// and BLeak will run the web app with the fixes applied.
exports.fixedLeaks = [0, 1, 2];
// (Optional) Proxy re-write rule that runs in a Node.js environment, *not* in the browser.
// Lets you rewrite the web app's JavaScript/HTML/CSS to test bug fixes. Especially useful for evaluating
// fixes on web apps you do not control.
// Return a Node.js Buffer containing the replacement resource contents, or the original contents if not
// modifying.
exports.rewrite = function(url /* URL of the resource */,
                  type /* MIME type of resource */,
                  data /* Contents of resource, as a Node.js Buffer */,
                  fixes /* Array of numerical IDs corresponding to bug fixes that are active during the session (see fixedLeaks) */) {
  function hasFix(n) {
    return fixes.indexOf(n) !== -1;
  }
  // Example: Filter out non-JavaScript resources.
  if (type.indexOf("javascript") !== -1) {
    if (url.indexOf("19/common.js") !== -1) {
      let src = data.toString();
      // Example: Replace a specific string in `19/common.js` to fix bug 0.
      if (hasFix(0)) {
        src = src.replace(`window.addEventListener("scroll",a,!1)`, 'window.onscroll=a');
      }
      return Buffer.from(src, 'utf8');
    }
  }
  return data;
};
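
Since the rewrite rule runs in Node.js rather than in the browser, it can be exercised locally before a full BLeak run. The following is a minimal sketch under stated assumptions: it defines a standalone `rewrite` function with the same four parameters as the rule above and calls it directly. The URLs, MIME types, and resource contents are made up for the check.

```javascript
// Same shape as the rewrite rule in the configuration above.
function rewrite(url, type, data, fixes) {
  const isTarget = type.indexOf('javascript') !== -1 &&
                   url.indexOf('19/common.js') !== -1;
  if (isTarget && fixes.indexOf(0) !== -1) {
    const src = data.toString('utf8')
      .replace('window.addEventListener("scroll",a,!1)', 'window.onscroll=a');
    return Buffer.from(src, 'utf8');
  }
  // Not a target resource, or fix 0 is inactive: return the original contents.
  return data;
}

// Made-up inputs, used only to exercise the rule.
const original = Buffer.from(
  'function a(){}window.addEventListener("scroll",a,!1);', 'utf8');
const scriptUrl = 'http://example.com/19/common.js';
const css = Buffer.from('body{}', 'utf8');

// Fix 0 active: the listener registration is replaced.
const fixed = rewrite(scriptUrl, 'application/javascript', original, [0]);
console.log(fixed.toString('utf8')); // prints function a(){}window.onscroll=a;

// No fixes active: the same Buffer comes back untouched.
const untouched = rewrite(scriptUrl, 'application/javascript', original, []);
console.log(untouched === original); // prints true

// Non-JavaScript resources pass through even when fix 0 is active.
const passedThrough = rewrite('http://example.com/site.css', 'text/css', css, [0]);
console.log(passedThrough === css); // prints true
```

To check a real rule, the same calls could be made against the `rewrite` export of your own configuration file, loaded with `require`.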

Developing

Interested in fixing bugs or building on BLeak? Excellent! Read below on how to build BLeak from source and run our unit tests.

Prerequisites

  • Yarn package manager
    • NPM may work, but we do not test against it

Building

# Install NPM dependencies (only need to run once)
yarn install
# Build BLeak
yarn run build

Testing

yarn test

Debugging Tips

The bleak executable (runnable via ./bleak once built) has a number of useful debug commands. For example, use proxy-session to debug issues with BLeak's proxy or its diagnosis phase.
