• Stars: 324
• Rank: 129,671 (Top 3 %)
• Language: Jupyter Notebook
• License: Apache License 2.0
• Created over 7 years ago
• Updated 3 months ago

Repository Details

Unintended ML Bias Analysis

This repository contains the Sentence Templates datasets we use to evaluate and mitigate unintended machine learning bias in Perspective API. See our accompanying blog post to learn more about how we created these datasets.

This work is part of the Conversation AI project, a collaborative research effort exploring ML as a tool for better discussions online.

NOTE: We moved outdated scripts, notebooks, and other resources to the archive subdirectory. We no longer maintain those resources, but you may find some of the content helpful. In particular, see model_bias_analysis.py for an example of how to analyze model bias.

Background

As part of the Perspective API model training process, we evaluate identity-term bias in our models on synthetically generated and “templated” test sets. To generate these sets, we plug identity terms into both toxic and non-toxic template sentences. For example, given templates like “I am a <modifier> <identity>”, we evaluate differences in score on sentences like:

“I am a kind American”

“I am a kind Muslim”

Scores that vary significantly across identity terms may indicate identity-term bias within the model.
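The template expansion described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build the datasets: the word lists, the function names `expand_template` and `score_spread`, and the length-based `toy_score` stand-in are all assumptions made for the example. In practice you would pass your own model's scoring function.

```python
from itertools import product

# Hypothetical template and word lists for illustration only;
# they are not the repository's actual data.
TEMPLATE = "I am a {modifier} {identity}"
MODIFIERS = ["kind", "friendly"]
IDENTITIES = ["American", "Muslim"]


def expand_template(template, modifiers, identities):
    """Return (identity, sentence) pairs for every modifier/identity combination."""
    return [
        (identity, template.format(modifier=modifier, identity=identity))
        for modifier, identity in product(modifiers, identities)
    ]


def score_spread(pairs, score_fn):
    """Average the score per identity term; return the means and the max-min gap."""
    scores = {}
    for identity, sentence in pairs:
        scores.setdefault(identity, []).append(score_fn(sentence))
    means = {identity: sum(vals) / len(vals) for identity, vals in scores.items()}
    return means, max(means.values()) - min(means.values())


def toy_score(sentence):
    """Stand-in scorer (assumption): depends only on sentence length.
    Replace with a call to the model under test."""
    return len(sentence) / 100


pairs = expand_template(TEMPLATE, MODIFIERS, IDENTITIES)
means, gap = score_spread(pairs, toy_score)
```

A large `gap` between otherwise identical sentences is the signal to look at more closely. What counts as "large" depends on the model and is not specified here.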

For more reading on unintended bias and how we measure bias using the resulting model scores, see:

Usage

We encourage researchers and developers to use these datasets to test for biases in their own models. However, Sentence Templates alone are insufficient for eliminating identity bias in machine learning language models. The examples are simple and unlikely to appear in real-world data and may reflect our own biases. The identity terms also vary across languages because direct word-for-word translation of identity terms across languages is not sufficient, or even possible, given differences in cultures, religions, idioms, and identities.
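One simple way to test your own model against templated sentences is to compare, per identity term, how often non-toxic sentences are flagged. The sketch below assumes a hypothetical row layout of (sentence, identity term, toxic label) and a placeholder `toy_model`; the real dataset files may be organized differently, and this check is not necessarily the metric the maintainers use.

```python
from collections import defaultdict

# Hypothetical rows: (sentence, identity_term, is_toxic).
# The real dataset's columns and layout may differ.
ROWS = [
    ("I am a kind American", "American", False),
    ("I am a kind Muslim", "Muslim", False),
    ("I am a hateful American", "American", True),
    ("I am a hateful Muslim", "Muslim", True),
]


def false_positive_rates(rows, score_fn, threshold=0.5):
    """Per identity term, the fraction of non-toxic sentences scored at or above threshold."""
    flagged = defaultdict(int)
    total = defaultdict(int)
    for sentence, identity, is_toxic in rows:
        if is_toxic:
            continue  # only non-toxic sentences can be false positives
        total[identity] += 1
        if score_fn(sentence) >= threshold:
            flagged[identity] += 1
    return {identity: flagged[identity] / total[identity] for identity in total}


def toy_model(sentence):
    """Stand-in for your own model (assumption): reacts only to the modifier."""
    return 0.9 if "hateful" in sentence else 0.1


rates = false_positive_rates(ROWS, toy_model)
```

With the placeholder model every rate is zero, because it ignores the identity term. A model whose rates differ noticeably between terms is flagging some identities more often on the same benign sentences.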

Copyright and license

All code in this repository is made available under the Apache 2 license. All data in this repository is made available under the Creative Commons Attribution 4.0 International license (CC BY 4.0). A full copy of the CC BY 4.0 license can be found at https://creativecommons.org/licenses/by/4.0/

More Repositories

1. perspectiveapi (888 stars)
   Perspective is an API that uses machine learning models to score the perceived impact a comment might have on a conversation. See https://developers.perspectiveapi.com for more information.

2. conversationai-moderator (TypeScript, 198 stars)
   A machine-assisted human-moderation toolkit.

3. conversationai-models (Jupyter Notebook, 138 stars)
   A repository to house model building experiments and tools that are part of the Conversation AI effort.

4. perspective-viewership-extension (TypeScript, 86 stars)
   Tune is a Chrome extension that allows users to set the "volume" of comment threads online by choosing what comments to read based on Toxicity scores provided by the Perspective API.

5. harassment-manager (TypeScript, 72 stars)
   Harassment Manager is a web application that aims to empower users to document and take action on abuse targeted at them on online platforms.

6. perspectiveapi-authorship-demo (TypeScript, 66 stars)
   Example code to illustrate how to build an authorship experience using the Perspective API.

7. wikidetox (OpenEdge ABL, 66 stars)
   Experiments to help discussion on Wikipedia talk pages.

8. perspectiveapi-simple-server (TypeScript, 35 stars)
   A simple Node.js server to allow controlled access to the Perspective API.

9. unhealthy-conversations (Jupyter Notebook, 33 stars)
   A corpus of comments tagged for multiple attributes of unhealthiness.

10. conversationai.github.io (HTML, 29 stars)
    Website for conversationai.github.io

11. perspective-hacks (JavaScript, 28 stars)

12. perspectiveapi-js-client (TypeScript, 20 stars)
    A simple example JS/TS client library.

13. conversationai-crowdsource (TypeScript, 11 stars)
    Project Gold ✨

14. conversationai-moderator-reddit (Python, 7 stars)
    Moderator support for Reddit.

15. firestore-perspective-toxicity (TypeScript, 7 stars)

16. perspectiveapi-proxy (TypeScript, 6 stars)
    Example code for an authenticated proxy for requests to the Perspective API.

17. conversationai-moderator-wordpress (PHP, 4 stars)
    WordPress support for Moderator.

18. conversationai-moderator-discourse (Ruby, 3 stars)
    Discourse support for Moderator.

19. perspectiveapi-appsscript (TypeScript, 1 star)
    A Google Apps Script project illustrating how to use Perspective in Google Sheets.