Documentation on how to access and use the Quick, Draw! Dataset.

The Quick, Draw! Dataset

![preview](https://raw.githubusercontent.com/googlecreativelab/quickdraw-dataset/master/preview.jpg)

The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game Quick, Draw!. The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located. You can browse the recognized drawings on quickdraw.withgoogle.com/data.

We're sharing them here for developers, researchers, and artists to explore, study, and learn from. If you create something with this dataset, please let us know by e-mail or at A.I. Experiments.

We have also released a tutorial and model for training your own drawing classifier on tensorflow.org.

Please keep in mind that while this collection of drawings was individually moderated, it may still contain inappropriate content.

Content

The raw moderated dataset

The raw data is available as ndjson files separated by category, in the following format:

Key         | Type                    | Description
key_id      | 64-bit unsigned integer | A unique identifier across all drawings.
word        | string                  | Category the player was prompted to draw.
recognized  | boolean                 | Whether the word was recognized by the game.
timestamp   | datetime                | When the drawing was created.
countrycode | string                  | A two-letter country code (ISO 3166-1 alpha-2) of where the player was located.
drawing     | string                  | A JSON array representing the vector drawing.

Each line contains one drawing. Here's an example of a single drawing:

  { 
    "key_id":"5891796615823360",
    "word":"nose",
    "countrycode":"AE",
    "timestamp":"2017-03-01 20:41:36.70725 UTC",
    "recognized":true,
    "drawing":[[[129,128,129,129,130,130,131,132,132,133,133,133,133,...]]]
  }

The format of the drawing array is as follows:

[ 
  [  // First stroke 
    [x0, x1, x2, x3, ...],
    [y0, y1, y2, y3, ...],
    [t0, t1, t2, t3, ...]
  ],
  [  // Second stroke
    [x0, x1, x2, x3, ...],
    [y0, y1, y2, y3, ...],
    [t0, t1, t2, t3, ...]
  ],
  ... // Additional strokes
]

Here x and y are the pixel coordinates, and t is the time in milliseconds since the first point. x and y are real-valued while t is an integer. The raw drawings can have vastly different bounding boxes and numbers of points due to the different devices used for display and input.
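
For reference, here is a minimal Python sketch of reading one of the raw ndjson files and walking the stroke arrays described above. The local filename full_raw_cat.ndjson is a hypothetical example download, not a path defined by this repository.

import json

# Each line of an ndjson file is one complete drawing.
# "full_raw_cat.ndjson" is a hypothetical local copy of one category.
with open("full_raw_cat.ndjson") as f:
    for line in f:
        drawing = json.loads(line)
        print(drawing["word"], drawing["countrycode"], drawing["recognized"])
        for stroke in drawing["drawing"]:
            xs, ys, ts = stroke  # parallel x, y, and time arrays for one stroke
            print(len(xs), "points, stroke starts at", ts[0], "ms")
        break  # only inspect the first drawing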

Preprocessed dataset

We've preprocessed and split the dataset into different files and formats to make it faster and easier to download and explore.

Simplified Drawing files (.ndjson)

We've simplified the vectors, removed the timing information, and positioned and scaled the data into a 256x256 region. The data is exported in ndjson format with the same metadata as the raw format. The simplification process was (see the sketch after this list):

  1. Align the drawing to the top-left corner, to have minimum values of 0.
  2. Uniformly scale the drawing, to have a maximum value of 255.
  3. Resample all strokes with a 1 pixel spacing.
  4. Simplify all strokes using the Ramer–Douglas–Peucker algorithm with an epsilon value of 2.0.
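
As an illustration only, the following Python sketch applies these four steps to one raw drawing. The rdp and resample helpers are simple stand-ins written for this example, not the exact code used to produce the released files.

import numpy as np

def rdp(points, epsilon=2.0):
    # Ramer-Douglas-Peucker: keep the endpoints, and recurse on the point
    # farthest from the start-end chord if it is more than epsilon away.
    if len(points) < 3:
        return points
    start, end = points[0], points[-1]
    chord = end - start
    length = np.hypot(chord[0], chord[1]) or 1.0
    dists = np.abs(chord[0] * (points[:, 1] - start[1]) -
                   chord[1] * (points[:, 0] - start[0])) / length
    split = int(np.argmax(dists))
    if dists[split] > epsilon:
        left = rdp(points[:split + 1], epsilon)
        right = rdp(points[split:], epsilon)
        return np.vstack([left[:-1], right])
    return np.vstack([start, end])

def resample(points, spacing=1.0):
    # Resample a stroke so consecutive points are roughly `spacing` px apart.
    deltas = np.diff(points, axis=0)
    seg = np.hypot(deltas[:, 0], deltas[:, 1]) if len(deltas) else np.array([])
    dist = np.concatenate([[0.0], np.cumsum(seg)])
    targets = np.append(np.arange(0.0, dist[-1], spacing), dist[-1])
    x = np.interp(targets, dist, points[:, 0])
    y = np.interp(targets, dist, points[:, 1])
    return np.column_stack([x, y])

def simplify_drawing(raw_strokes):
    # raw_strokes: the "drawing" field of one raw ndjson line, i.e. a list of
    # [x_array, y_array, t_array] strokes. Returns simplified [x, y] strokes.
    strokes = [np.array(s[:2], dtype=float).T for s in raw_strokes]
    all_pts = np.vstack(strokes)
    origin = all_pts.min(axis=0)                          # 1. align to top-left
    scale = 255.0 / max((all_pts - origin).max(), 1e-6)   # 2. scale max to 255
    out = []
    for s in strokes:
        s = (s - origin) * scale
        s = resample(s, spacing=1.0)                      # 3. 1-pixel resampling
        s = rdp(s, epsilon=2.0)                           # 4. RDP, epsilon 2.0
        out.append(np.round(s).astype(int).T.tolist())
    return out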

There is an example in examples/nodejs/simplified-parser.js showing how to read ndjson files in NodeJS.
Additionally, the examples/nodejs/ndjson.md document details a set of command-line tools that can help explore subsets of these quite large files.

Binary files (.bin)

The simplified drawings and metadata are also available in a custom binary format for efficient compression and loading.

There is an example in examples/binary_file_parser.py showing how to load the binary files in Python.
There is also an example in examples/nodejs/binary-parser.js showing how to read the binary files in NodeJS.

Numpy bitmaps (.npy)

All the simplified drawings have been rendered into a 28x28 grayscale bitmap in numpy .npy format. The files can be loaded with np.load(). These images were generated from the simplified data, but are aligned to the center of the drawing's bounding box rather than the top-left corner. See here for the code snippet used for generation.
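
For example, a minimal Python sketch of loading one of these files; the local filename cat.npy is a hypothetical example download, and the flattened (num_drawings, 784) shape is an assumption to verify against your copy.

import numpy as np

# "cat.npy" is a hypothetical local copy of one category's bitmap file.
bitmaps = np.load("cat.npy")
print(bitmaps.shape)  # expected: (num_drawings, 784), i.e. flattened 28x28

# Reshape one row back into a 28x28 grayscale image (values 0-255).
first = bitmaps[0].reshape(28, 28)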

Get the data

The dataset is available on Google Cloud Storage as ndjson files separated by category. See the list of files in the Cloud Console, or read more about accessing public datasets using other methods. As an example, to easily download all simplified drawings, one way is to run the command gsutil -m cp 'gs://quickdraw_dataset/full/simplified/*.ndjson' .

Full dataset separated by categories

Sketch-RNN QuickDraw Dataset

This data is also used for training the Sketch-RNN model. An open source TensorFlow implementation of this model is available in the Magenta Project (link to GitHub repo). You can also read more about this model in this Google Research blog post. The data is stored in compressed .npz files, in a format suitable for inputs into a recurrent neural network.

In this dataset, 75K samples (70K Training, 2.5K Validation, 2.5K Test) have been randomly selected from each category and processed with RDP line simplification using an epsilon parameter of 2.0. Each category is stored in its own .npz file, for example, cat.npz.

We have also provided the full data for each category, if you want to use more than 70K training examples. These are stored with a .full.npz extension.

Note: In Python 3, load the npz files using np.load(data_filepath, encoding='latin1', allow_pickle=True)
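
As a sketch, loading one of these files in Python 3 might look like the following; the train/valid/test keys and the stroke-3 (dx, dy, pen-lift) sample format reflect the standard Sketch-RNN setup and should be verified against your copy.

import numpy as np

# "cat.npz" is a hypothetical local copy of one category's Sketch-RNN file.
data = np.load("cat.npz", encoding="latin1", allow_pickle=True)
train, valid, test = data["train"], data["valid"], data["test"]
print(len(train), len(valid), len(test))  # e.g. 70000, 2500, 2500

# Each sample is expected to be an (N, 3) array of (dx, dy, pen_lift) rows.
print(train[0].shape)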

Instructions for converting raw ndjson files to this npz format are available in this notebook.

Projects using the dataset

Here are some projects and experiments that are using or featuring the dataset in interesting ways. Got something to add? Let us know!

Creative and artistic projects

Data analyses

Papers

Guides & Tutorials

Code and tools

Changes

May 25, 2017: Updated Sketch-RNN QuickDraw dataset, created .full.npz complementary sets.

License

This data is made available by Google, Inc. under the Creative Commons Attribution 4.0 International license.

Dataset Metadata

The following table is necessary for this dataset to be indexed by search engines such as Google Dataset Search.

property      | value
name          | The Quick, Draw! Dataset
alternateName | Quick Draw Dataset
alternateName | quickdraw-dataset
url           |
sameAs        | https://github.com/googlecreativelab/quickdraw-dataset
description   | The Quick Draw Dataset is a collection of 50 million drawings across 345 categories, contributed by players of the game "Quick, Draw!". The drawings were captured as timestamped vectors, tagged with metadata including what the player was asked to draw and in which country the player was located.\n \n Example drawings: ![preview](https://raw.githubusercontent.com/googlecreativelab/quickdraw-dataset/master/preview.jpg)
provider      | (nested table)
    property | value
    name     | Google
    sameAs   | https://en.wikipedia.org/wiki/Google
license       | (nested table)
    property | value
    name     | CC BY 4.0
    url      |
