Platform for Machine Learning projects on Software Engineering

bugbug

Bugbug aims to leverage machine learning techniques to help with bug and quality management, as well as other software engineering tasks such as test selection and defect prediction.

Chat with us in the bugbug Matrix room.

More information on the Mozilla Hacks blog:

Classifiers

  • assignee - The aim of this classifier is to suggest an appropriate assignee for a bug.

  • backout - The aim of this classifier is to detect patches that might be more likely to be backed-out (because of build or test failures). It could be used for test prioritization/scheduling purposes.

  • bugtype - The aim of this classifier is to classify bugs according to their type. The labels are gathered automatically from bugs: right now they are "crash/memory/performance/security". The plan is to add more types after manual labeling.

  • component - The aim of this classifier is to assign product/component to (untriaged) bugs.

  • defect vs enhancement vs task - Extension of the defect classifier that also distinguishes between feature requests and development tasks.

  • defect - Bugs on Bugzilla aren't always bugs. Sometimes they are feature requests, refactorings, and so on. The aim of this classifier is to distinguish between bugs that are actually bugs and bugs that aren't. The dataset currently contains 2110 bugs; the accuracy of the current classifier is ~93% (precision ~95%, recall ~94%).

  • devdocneeded - The aim of this classifier is to detect bugs which should be documented for developers.

  • duplicate - The aim of this classifier is to detect duplicate bugs.

  • needsdiagnosis - The aim of this classifier is to detect issues that are likely invalid and don't need to be diagnosed, for the webcompat use case.

  • qaneeded - The aim of this classifier is to detect bugs that would need QA verification.

  • regression vs non-regression - Bugzilla has a regression keyword to identify bugs that are regressions. Unfortunately it isn't used consistently. The aim of this classifier is to detect bugs that are regressions.

  • regressionrange - The aim of this classifier is to detect regression bugs that have a regression range vs those that don't.

  • regressor - The aim of this classifier is to detect patches which are more likely to cause regressions. It could be used to make riskier patches undergo more scrutiny.

  • spam - The aim of this classifier is to detect bugs which are spam.

  • stepstoreproduce - The aim of this classifier is to detect bugs that have steps to reproduce vs those that don't.

  • testfailure - The aim of this classifier is to detect patches that might be more likely to cause test failures.

  • testselect - The aim of this classifier is to select relevant tests to run for a given patch.

  • tracking - The aim of this classifier is to detect bugs to track.

  • uplift - The aim of this classifier is to detect bugs for which uplift should be approved and bugs for which uplift should not be approved.
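
The defect entry above reports accuracy, precision, and recall. As a reminder of how those three figures are derived from a confusion matrix, here is a minimal sketch. The function name and the counts are hypothetical and chosen for illustration only; they are not bugbug's evaluation code or its actual results.

```python
def classification_metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, and recall from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. This sketch assumes the denominators are non-zero.
    """
    total = tp + fp + fn + tn
    return {
        "accuracy": (tp + tn) / total,   # share of all predictions that are correct
        "precision": tp / (tp + fp),     # share of predicted positives that are real
        "recall": tp / (tp + fn),        # share of real positives that were found
    }


# Hypothetical counts, not taken from the bugbug dataset.
metrics = classification_metrics(tp=90, fp=10, fn=10, tn=90)
```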

Setup and Prerequisites

Install the Python dependencies:

pip3 install -r requirements.txt

You may also need to run pip install -r test-requirements.txt. Depending on the parts of bugbug you want to run, you might need to install dependencies from other requirement files (find them with find . -name "*requirements*").
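
If you would rather list the requirement files from Python than with find, something like the following sketch should work. The helper name find_requirement_files is hypothetical and not part of bugbug. Unlike the find command above, it only returns regular files, not directories.

```python
from pathlib import Path


def find_requirement_files(root: str = ".") -> list[str]:
    """Return sorted paths of files under `root` whose name contains 'requirements'."""
    return sorted(
        str(path)
        for path in Path(root).rglob("*requirements*")
        if path.is_file()  # skip directories that happen to match the pattern
    )


if __name__ == "__main__":
    # Run from the root of your bugbug clone.
    for path in find_requirement_files("."):
        print(path)
```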

Currently, Python 3.9+ is required. You can double check the version we use by looking at setup.py.
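
A quick way to check your interpreter against that requirement is sketched below. The 3.9 minimum is taken from the sentence above and may lag behind the repository, so treat setup.py as the authoritative source. The function name is hypothetical.

```python
import sys

# Minimum version stated in this README; setup.py is the authoritative source.
MIN_PYTHON = (3, 9)


def python_is_supported(version_info=sys.version_info, minimum=MIN_PYTHON) -> bool:
    """Return True if the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


if __name__ == "__main__":
    if not python_is_supported():
        raise SystemExit(f"Python {MIN_PYTHON[0]}.{MIN_PYTHON[1]}+ is required")
```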

libgit2 might also be required. It needs v1.0.0, which on Debian is only available in experimental. If you can't install it, skip this step.

sudo apt-get -t experimental install libgit2-dev

Auto-formatting

This project uses pre-commit. Please run pre-commit install to install the git pre-commit hooks on your clone.

Every time you try to commit, pre-commit will run checks on your files to make sure they follow our style standards and aren't affected by some simple issues. If the checks fail, pre-commit won't let you commit.

Usage

Training

Run the trainer.py script with the command python -m scripts.trainer (add --help to see the required and optional arguments) to perform training. Warning: this takes 30+ minutes.

Testing

To use a model to classify a given bug, you can run python -m scripts.bug_classifier MODEL_NAME --bug-id ID_OF_A_BUG_FROM_BUGZILLA. N.B.: If you run the classifier script without training a model first, it will automatically download an already trained model.

Example for the "defect" model

Training: to train the defect model, run:

python3 -m scripts.trainer defect

Testing: to use the model to classify a given bug, run python -m scripts.bug_classifier defect --bug-id ID_OF_A_BUG_FROM_BUGZILLA.
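
If you want to call the classifier script from another Python program, one option is to build the same command line and run it with subprocess. This is a minimal sketch: the helper names are hypothetical, and it assumes it is run with the bugbug clone as the working directory so that the scripts package can be found.

```python
import subprocess
import sys


def classifier_command(model_name: str, bug_id: int) -> list[str]:
    """Build the documented `python -m scripts.bug_classifier` command line."""
    return [
        sys.executable,  # use the same interpreter that runs this script
        "-m",
        "scripts.bug_classifier",
        model_name,
        "--bug-id",
        str(bug_id),
    ]


def classify_bug(model_name: str, bug_id: int, repo_dir: str = "."):
    """Run the classifier script; `repo_dir` is assumed to be the bugbug clone."""
    return subprocess.run(classifier_command(model_name, bug_id), cwd=repo_dir, check=True)
```

For example, classify_bug("defect", SOME_BUG_ID) would be equivalent to the command shown above, where SOME_BUG_ID is the ID of a bug on Bugzilla.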

Running the repository mining script

Note: This section is only necessary if you want to perform changes to the repository mining script. Otherwise, you can simply use the commits data we generate automatically.

  1. Clone https://hg.mozilla.org/mozilla-central/.
  2. Run ./mach vcs-setup in the directory where you have cloned mozilla-central.
  3. Enable the extensions mentioned in infra/hgrc. For example, if you are on Linux, you can add firefoxtree to the extensions section of the ~/.hgrc file as:
    firefoxtree = ~/.mozbuild/version-control-tools/hgext/firefoxtree
    
  4. Run the repository.py script, with the only argument being the path to the mozilla-central repository.

Note: If you run into problems, it's possible the version of Mercurial you are using is not supported. Check the Docker definition at infra/dockerfile.commit_retrieval to see what we are using in production.

Note: The script will take a long time to run (more than 7 hours on my laptop). If you want to test a simple change and you don't intend to actually mine the data, you can modify the repository.py script to limit the number of analyzed commits. Simply add limit=1024 to the call to the log command.

Structure of the project

  • bugbug/labels contains manually collected labels;
  • bugbug/db.py is an implementation of a really simple JSON database;
  • bugbug/bugzilla.py contains the functions to retrieve bugs from the Bugzilla tracking system;
  • bugbug/repository.py contains the functions to mine data from the mozilla-central (Firefox) repository;
  • bugbug/bug_features.py contains functions to extract features from bug/commit data;
  • bugbug/model.py contains the base class that all models derive from;
  • bugbug/models contains implementations of specific models;
  • bugbug/nn.py contains utility functions to include Keras models into a scikit-learn pipeline;
  • bugbug/utils.py contains misc utility functions;
  • bugbug/nlp contains utility functions for NLP;
  • bugbug/labels.py contains utility functions for handling labels;
  • bugbug/bug_snapshot.py contains a module to play back the history of a bug;
  • bugbug/github.py contains functions to retrieve issues from GitHub for a specified owner/repository.
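
To give a rough idea of what "a really simple JSON database" can look like, here is a generic sketch that stores one JSON object per line. It is not the implementation in bugbug/db.py; the function names and the file layout are assumptions made for illustration.

```python
import json
from typing import Iterable, Iterator


def append_records(path: str, records: Iterable[dict]) -> None:
    """Append each record to `path` as one JSON object per line."""
    with open(path, "a", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record) + "\n")


def read_records(path: str) -> Iterator[dict]:
    """Lazily yield the records stored in `path`, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            if line.strip():
                yield json.loads(line)
```

Storing one object per line keeps appends cheap and lets readers stream a large file without loading it all into memory.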

Using bugbug for non-Mozilla projects

Bugbug focuses on Mozilla use cases for Firefox, Bugzilla, and GitHub. However, we will be happy to accept pull requests adding support for other projects or bug trackers.
