  • Stars: 170
  • Rank 223,357 (Top 5 %)
  • Language: TypeScript
  • License: Apache License 2.0
  • Created over 3 years ago
  • Updated 5 months ago


Repository Details

The Data Cards Playbook helps dataset producers and publishers adopt a people-centered approach to transparency in dataset documentation.

Data Cards Playbook

The Data Cards Playbook helps dataset producers and publishers adopt a people-centered approach to transparency in dataset documentation. Using the Playbook activities and resources on our website, you can create transparency-focused metadata schemas for datasets across domains, organizational structures, and audience groups.

In this repository, you can:

  • Explore templates of Transparency Artifacts (Data Cards, Model Cards, Healthsheets)
  • See and contribute examples of Data Cards in this repository

Data Cards

Data Cards are structured summaries of essential facts about various aspects of ML datasets that stakeholders need across a dataset's lifecycle for responsible AI development. These summaries explain the processes and rationales that shape the data and, consequently, the models: upstream sources; data collection and annotation methods; training and evaluation methods; intended use; and decisions affecting model performance.
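As a rough illustration of what "structured summary" can mean in practice, the sketch below models a Data Card as a typed record and reports which sections are still empty. The interface name, field names, and the `missingFields` helper are hypothetical, chosen only to mirror the aspects listed above; they are not the schema of the official template.

```typescript
// Hypothetical shape for a Data Card summary. Field names are assumptions
// that mirror the aspects named in the text, not the official template.
interface DataCardSummary {
  datasetName: string;
  upstreamSources: string[];
  collectionMethods: string[];
  annotationMethods: string[];
  intendedUse: string;
}

// Returns the names of fields that are empty, in declaration order.
function missingFields(card: DataCardSummary): string[] {
  const missing: string[] = [];
  if (card.datasetName.trim() === "") missing.push("datasetName");
  if (card.upstreamSources.length === 0) missing.push("upstreamSources");
  if (card.collectionMethods.length === 0) missing.push("collectionMethods");
  if (card.annotationMethods.length === 0) missing.push("annotationMethods");
  if (card.intendedUse.trim() === "") missing.push("intendedUse");
  return missing;
}

// Illustrative values only; this is not a real dataset.
const exampleCard: DataCardSummary = {
  datasetName: "example-reviews",
  upstreamSources: ["public product reviews"],
  collectionMethods: ["web export"],
  annotationMethods: ["crowdsourced sentiment labels"],
  intendedUse: "Benchmarking sentiment classifiers",
};

console.log(missingFields(exampleCard));
```

A check like this could run before a card is published, so that empty sections are caught early. Which sections are mandatory is a decision for each team.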

Read our paper on Data Cards

Watch the paper video from FAccT 2022

Hands-on Data Card creation

Our Data Card template is available in .docx format. It contains numerous sections, questions and guidelines for responses that are designed to comprehensively document any possible dataset.

Along with Data Cards, we've also made Healthsheets (Research Paper) and Model Card (Research Paper) templates available to document healthcare-specific datasets and general-purpose models, respectively.

Examples of Data Cards

Want to add your Data Card to this list? Open an issue!

Frequently Asked Questions (FAQs)

Coming Soon

Note

The Data Cards Playbook is being actively developed and documentation is likely to change as we improve our methodologies. We want to hear from you! Leave notes, feedback, or suggestions on our GitHub. Use #datacardsplaybook.

Citation

M. Pushkarna, A. Zaldivar, D. Nanas, et al. Data Cards Playbook. Published March 5, 2021.

License

The Data Cards Playbook is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Credits

Core Team

This work was co-created by Mahima Pushkarna and Andrew Zaldivar and done in collaboration with Reena Jana, Vivian Tsai, and Oddur Kjartansson. We want to thank Donald Gonzalez, Dan Nanas, Parker Barnes, Laura Rosenstein, Diana Akrong, Monica Caraway, Ding Wang, Danielle Smalls, Aybuke Turker, Emily Brouillet, Andrew Fuchs, Sebastian Gehrmann, Cassie Kozyrkov, Alex Siegman, and Anthony Keene for their immense contributions; and Meg Mitchell and Timnit Gebru for championing this work.

We also want to thank Adam Boulanger, Lauren Wilcox, Parker Barnes, Roxanne Pinto and Ayça Çakmakli for their feedback; Tulsee Doshi, Dan Liebling, Meredith Morris, Lucas Dixon, Fernanda Viegas, Jen Gennai, and Marian Croak for their support. This work would not have been possible without our workshop and study participants, and numerous partners, whose insights and experiences have shaped this Playbook.

Special Thanks

This work would not have been possible without our workshop participants, supporters and champions, whose insights and experiences have shaped this Playbook: Lucas Ackerknecht, Hartwig Adam, Seiji Armstrong, Lora Aroyo, Sebastian Assaf, Anurag Batra, Samy Bengio, Louisa Bostrom, Thomas Cadwalader, Michelle Carney, Will Carter, Amanda Casari, Di Dang, Alex David Norton, Tiffany Deng, Emily Denton, Tulsee Doshi, Madeleine Elish, Patrick Gage Kelley, Timnit Gebru, Sara Goetz, Robbie Gonzalez, Alex Hanna, Jing Hua, Ben Hutchinson, Nathan Ie, Robyn Im, Orion Jankowski, Ellen Jiang, Shivani Kapania, David Karam, Daniel Kim, Leslie Lai, Eryka Lehr, Elijah Logan, Daphne Luong, Nicole Maffeo, Meg Mitchell, Maysam Moussalem, Unni Nair, Ricardo Olenewa, Kristen Olson, Praveen Paritosh, Adam Pearce, Angie Peng, Ludovic Peran, Roxanne Pinto, Vinodkumar Prabhakaran, Rida Qadri, Ravi Rajakumar, Hima Rajana, Susanna Ricco, Kevin Robinson, Taylor Roper, Negar Rostamzadeh, Mo Shomrat, Andrew Smart, Jamila Smith-Loud, Nithum Thain, Janel Thamkul, Aybuke Turker, Joseph Thomas, Bobby Tran, James Wang, Martin Wattenberg, James Wexler, Catherine Williams, Catherina Xu, Tabitha Yong, and Ben Zevenbergen.

More Repositories

1. facets (Jupyter Notebook, 7,345 stars)
   Visualizations for machine learning datasets

2. lit (TypeScript, 3,482 stars)
   The Learning Interpretability Tool: Interactively analyze ML models to understand their behavior in an extensible and framework-agnostic interface.

3. saliency (Jupyter Notebook, 951 stars)
   Framework-agnostic implementation for state-of-the-art saliency methods (XRAI, BlurIG, SmoothGrad, and more).

4. what-if-tool (HTML, 909 stars)
   Source code/webpage/demos for the What-If Tool

5. umap-js (JavaScript, 375 stars)
   JavaScript implementation of UMAP

6. llm-comparator (JavaScript, 286 stars)
   LLM Comparator is an interactive data visualization tool for evaluating and analyzing LLM responses side-by-side, developed by the PAIR team.

7. knowyourdata (CSS, 281 stars)
   A tool to help researchers and product teams understand datasets with the goal of improving data quality, and mitigating fairness and bias issues.

8. wordcraft (TypeScript, 239 stars)
   ✨✍️ Wordcraft is an AI-powered text editor with an emphasis on short story writing

9. scatter-gl (TypeScript, 168 stars)
   Interactive 3D / 2D WebGL-accelerated scatter plot point renderer

10. understanding-umap (JavaScript, 164 stars)
    Understanding the theory behind UMAP

11. federated-learning (TypeScript, 160 stars)
    Federated learning experiment using TensorFlow.js

12. interpretability (JavaScript, 147 stars)
    PAIR.withgoogle.com and friends' work on interpretability methods

13. ai-explorables (Jupyter Notebook, 59 stars)
    https://pair.withgoogle.com/explorables/

14. cococo (TypeScript, 46 stars)
    𝄡 Collaborative Convolutional Counterpoint

15. cam-scroller (JavaScript, 33 stars)
    Cam Scroller is an open-source Chrome extension that uses your webcam and deeplearn.js to enable scrolling through webpages using custom gestures that you define.

16. font-explorer (Vue, 32 stars)
    Font latent space explorer using TensorFlow.js

17. clinical-vis (HTML, 26 stars)
    A JavaScript medical record visualization (https://arxiv.org/abs/1810.05798)

18. megaplot (TypeScript, 19 stars)

19. depth-maps-art-and-illusions (TypeScript, 18 stars)

20. pair-code.github.io (HTML, 18 stars)

21. farsight (TypeScript, 17 stars)
    In situ interactive widgets for responsible AI 🌱

22. tiny-transformers (Jupyter Notebook, 16 stars)

23. recommendation-rudders (TypeScript, 13 stars)

24. covid19_symptom_dataset (JavaScript, 12 stars)

25. thehardway (JavaScript, 11 stars)
    Supplementary code repository to accompany Tic-Tac-Toe the Hard Way podcast

26. jax-recommenders (Python, 9 stars)

27. autonotes (TypeScript, 8 stars)
    AutoNotes is an experimental prototype for AI-powered notetaking, with features including hierarchical tagging, "chat with your notes," and highlights.

28. book-viz (Jupyter Notebook, 6 stars)
    Visualizing multilevel structure in books with sentence embeddings.

29. model-alignment (Python, 6 stars)
    Model Alignment is a Python library from the PAIR team that enables users to create model prompts through user feedback instead of manual prompt writing and editing. The technique makes use of constitutional principles to align prompts to users' desired values.

30. waterfall-of-meaning (TypeScript, 5 stars)

31. deliberate-lab (TypeScript, 4 stars)
    Platform for running online research experiments on human + LLM group dynamics.

32. deeplearnjs-legacy-loader (Python, 3 stars)
    Deprecated: Legacy TensorFlow model loader for deeplearn.js

33. colormap (JavaScript, 3 stars)

34. adversarial-nibbler-vis (TypeScript, 3 stars)
    An interactive visualization interface for exploring and analyzing the Adversarial Nibbler dataset

35. auto-histograms (Python, 2 stars)

36. ml-vis-experiments (Jupyter Notebook, 1 star)

37. deeplearnjs-docs (TypeScript, 1 star)