  • Language
    TypeScript
  • License
    MIT License
  • Created almost 4 years ago
  • Updated 6 months ago

Repository Details

Tidy up your data with JavaScript, inspired by dplyr and the tidyverse

tidy.js

Tidy up your data with JavaScript! Inspired by dplyr and the tidyverse, tidy.js attempts to bring the ergonomics of data manipulation from R to JavaScript (and TypeScript). The primary goals of the project are:

  • Readable code. Tidy.js prioritizes making your data transformations readable, so future you and your teammates can get up and running quickly.

  • Standard transformation verbs. Tidy.js is built using battle-tested verbs from the R community that can handle any data wrangling need.

  • Work with plain JS objects. No wrapper classes needed — all tidy.js needs is an array of plain old-fashioned JS objects to get started. Simple in, simple out.

Secondarily, this project aims to provide acceptable types for the functions provided.

Related work

Be sure to check out a very similar project, Arquero, from UW Data.

Getting started

To start using tidy, your best bet is to install from npm:

npm install @tidyjs/tidy
# or
yarn add @tidyjs/tidy

Then import the functions you need:

import { tidy, mutate, arrange, desc } from '@tidyjs/tidy'

Note: if you're just trying tidy in a browser, you can use the UMD build hosted on jsDelivr (codesandbox example):

<script src="https://d3js.org/d3-array.v2.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/@tidyjs/tidy/dist/umd/tidy.min.js"></script>
<script>
  const { tidy, mutate, arrange, desc } = Tidy;
  // ...
</script>  

And use them on an array of objects:

const data = [
  { a: 1, b: 10 }, 
  { a: 3, b: 12 }, 
  { a: 2, b: 10 }
]

const results = tidy(
  data, 
  mutate({ ab: d => d.a * d.b }),
  arrange(desc('ab'))
)

The output is:

[
  { a: 3, b: 12, ab: 36},
  { a: 2, b: 10, ab: 20},
  { a: 1, b: 10, ab: 10}
]

All tidy.js code is wrapped in a tidy flow via the tidy() function. The first argument is the array of data, followed by the transformation verbs to run on the data. The actual functions passed to tidy() can be anything so long as they fit the form:

(items: object[]) => object[]

For example, the following is valid:

tidy(
  data, 
  items => items.filter((d, i) => i % 2 === 0),
  arrange(desc('value'))
)

All tidy verbs fit this style, with the exception of exports from groupBy, discussed below.
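
Because a verb is just a function from an array to an array, you can also write reusable verbs of your own. The sketch below is illustrative only: everyNth and runFlow are hypothetical names, and runFlow is a simplified stand-in for how a flow chains its steps, not tidy's actual implementation.

```typescript
type Row = { value: number };

// A step in a flow: takes an array of items and returns a new array.
type Verb<T extends object> = (items: T[]) => T[];

// Hypothetical custom verb factory: keep every nth item, starting with the first.
function everyNth<T extends object>(n: number): Verb<T> {
  return (items) => items.filter((_, i) => i % n === 0);
}

// Simplified stand-in for tidy(): feed each verb the previous verb's result.
function runFlow<T extends object>(items: T[], ...verbs: Verb<T>[]): T[] {
  return verbs.reduce((acc, verb) => verb(acc), items);
}

const rows: Row[] = [
  { value: 1 },
  { value: 2 },
  { value: 3 },
  { value: 4 },
  { value: 5 },
];

const sampled = runFlow<Row>(
  rows,
  everyNth<Row>(2),
  // An inline step works too: sort a copy in descending order of value.
  (items) => items.slice().sort((x, y) => y.value - x.value)
);

console.log(sampled); // [ { value: 5 }, { value: 3 }, { value: 1 } ]
```

Since everyNth(2) returns a function of the form (items: object[]) => object[], the same function should be usable as a step inside tidy() alongside the built-in verbs.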

Grouping data with groupBy

Besides manipulating flat lists of data, tidy provides facilities for wrangling grouped data via the groupBy() function.

import { tidy, summarize, sum, groupBy } from '@tidyjs/tidy'

const data = [
  { key: 'group1', value: 10 }, 
  { key: 'group2', value: 9 }, 
  { key: 'group1', value: 7 }
]

tidy(
  data,
  groupBy('key', [
    summarize({ total: sum('value') })
  ])
)

The output is:

[
  { "key": "group1", "total": 17 },
  { "key": "group2", "total": 9 },
]

The groupBy() function works similarly to tidy() in that it takes a flow of functions as its second argument (wrapped in an array). Things get really fun when you use groupBy's third argument for exporting the grouped data into different shapes.
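
Conceptually, a grouped flow partitions the items by key, runs the flow on each partition, and concatenates the results. The following is a minimal sketch of that idea in plain TypeScript, assuming a single string key; groupAndRun and sumBy are hypothetical helpers written for illustration and are not part of tidy's API or its implementation.

```typescript
type Item = { key: string; value: number };
type Summary = { key: string; total: number };

// Hypothetical helper: partition items by key, run `flow` on each group,
// then flatten the per-group results back into a single array.
function groupAndRun<T, R>(
  items: T[],
  keyOf: (item: T) => string,
  flow: (group: T[], key: string) => R[]
): R[] {
  const groups: Record<string, T[]> = Object.create(null);
  const order: string[] = []; // group keys in first-seen order
  for (const item of items) {
    const k = keyOf(item);
    if (!(k in groups)) {
      groups[k] = [];
      order.push(k);
    }
    groups[k].push(item);
  }
  const out: R[] = [];
  for (const k of order) {
    out.push(...flow(groups[k], k));
  }
  return out;
}

// Hypothetical helper: sum a numeric accessor over a list of items.
function sumBy<T>(items: T[], f: (d: T) => number): number {
  return items.reduce((acc, d) => acc + f(d), 0);
}

const data: Item[] = [
  { key: 'group1', value: 10 },
  { key: 'group2', value: 9 },
  { key: 'group1', value: 7 },
];

// Each group is reduced to one summary row, mirroring the summarize() example.
const totals = groupAndRun<Item, Summary>(
  data,
  (d) => d.key,
  (group, key) => [{ key, total: sumBy(group, (d) => d.value) }]
);

console.log(totals);
// [ { key: 'group1', total: 17 }, { key: 'group2', total: 9 } ]
```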

For example, exporting data as a nested object, we can use groupBy.object() as the third argument to groupBy().

const data = [
  { g: 'a', h: 'x', value: 5 },
  { g: 'a', h: 'y', value: 15 },
  { g: 'b', h: 'x', value: 10 },
  { g: 'b', h: 'x', value: 20 },
  { g: 'b', h: 'y', value: 30 },
]

tidy(
  data,
  groupBy(
    ['g', 'h'], 
    [
      mutate({ key: d => `${d.g}${d.h}`})
    ], 
    groupBy.object() // <-- specify the export
  )
);

The output is:

{
  "a": {
    "x": [{"g": "a", "h": "x", "value": 5, "key": "ax"}],
    "y": [{"g": "a", "h": "y", "value": 15, "key": "ay"}]
  },
  "b": {
    "x": [
      {"g": "b", "h": "x", "value": 10, "key": "bx"},
      {"g": "b", "h": "x", "value": 20, "key": "bx"}
    ],
    "y": [{"g": "b", "h": "y", "value": 30, "key": "by"}]
  }
}

Or alternatively as { key, values } entries-objects via groupBy.entriesObject():

tidy(data,
  groupBy(
    ['g', 'h'], 
    [
      mutate({ key: d => `${d.g}${d.h}`})
    ], 
    groupBy.entriesObject() // <-- specify the export
  )
);

The output is:

[
  {
    "key": "a",
    "values": [
      {"key": "x", "values": [{"g": "a", "h": "x", "value": 5, "key": "ax"}]},
      {"key": "y", "values": [{"g": "a", "h": "y", "value": 15, "key": "ay"}]}
    ]
  },
  {
    "key": "b",
    "values": [
      {
        "key": "x",
        "values": [
          {"g": "b", "h": "x", "value": 10, "key": "bx"},
          {"g": "b", "h": "x", "value": 20, "key": "bx"}
        ]
      },
      {"key": "y", "values": [{"g": "b", "h": "y", "value": 30, "key": "by"}]}
    ]
  }
]

It's common to be left with a single item in each leaf of a groupBy set, especially after running summarize(). To prevent the exported data from having those values wrapped in an array, pass the single option to the export.

tidy(data,
  groupBy(['g', 'h'], [
    summarize({ total: sum('value') })
  ], groupBy.object({ single: true }))
);

The output is:

{
  "a": {
    "x": {"total": 5, "g": "a", "h": "x"},
    "y": {"total": 15, "g": "a", "h": "y"}
  },
  "b": {
    "x": {"total": 30, "g": "b", "h": "x"},
    "y": {"total": 30, "g": "b", "h": "y"}
  }
}
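
To make the difference concrete, here is a rough sketch of nesting already-summarized rows with and without unwrapping the leaves. It assumes that single simply replaces each one-item leaf array with its only item; nestObject is a hypothetical helper written for this illustration, not tidy's implementation.

```typescript
type Row = Record<string, unknown>;

// Hypothetical helper: nest rows by successive keys. With `single` set,
// each leaf holds its first row instead of an array of rows.
function nestObject(rows: Row[], keys: string[], single = false): unknown {
  if (keys.length === 0) {
    return single ? rows[0] : rows;
  }
  const [head, ...rest] = keys;
  const buckets: Record<string, Row[]> = Object.create(null);
  for (const row of rows) {
    const k = String(row[head]);
    if (!(k in buckets)) {
      buckets[k] = [];
    }
    buckets[k].push(row);
  }
  const out: Record<string, unknown> = {};
  for (const k of Object.keys(buckets)) {
    out[k] = nestObject(buckets[k], rest, single);
  }
  return out;
}

// One row per (g, h) pair, as left behind by a summarize step.
const summarized = [
  { g: 'a', h: 'x', total: 5 },
  { g: 'a', h: 'y', total: 15 },
  { g: 'b', h: 'x', total: 30 },
  { g: 'b', h: 'y', total: 30 },
];

const wrapped = nestObject(summarized, ['g', 'h']); // leaves are one-item arrays
const unwrapped = nestObject(summarized, ['g', 'h'], true); // leaves are plain rows

console.log(JSON.stringify(wrapped));
console.log(JSON.stringify(unwrapped));
```

In the first result every leaf looks like [{...}], while in the second each leaf is the row itself, which is usually the more convenient shape for lookups.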

Visit the API reference docs to learn more about how each function works and the options it takes. Be sure to check out the levels export, which lets you mix and match export types based on the depth of the data. For quick reference, the available groupBy exports include:

  • groupBy.entries()
  • groupBy.entriesObject()
  • groupBy.grouped()
  • groupBy.levels()
  • groupBy.object()
  • groupBy.keys()
  • groupBy.map()
  • groupBy.values()

Developing

clone the repo:

git clone git@github.com:pbeshai/tidy.git

install dependencies:

yarn

initialize lerna:

lerna bootstrap

build tidy:

yarn run build

test all of tidy:

yarn run test

test:watch a single package:

yarn workspace @tidyjs/tidy test:watch

Conventional commits

This library uses conventional commits, following the Angular convention. Prefixes are:

  • build: Changes that affect the build system or external dependencies (example scopes: yarn, npm)
  • ci: Changes to our CI configuration files and scripts (e.g. CircleCI)
  • chore
  • docs: Documentation only changes
  • feat: A new feature
  • fix: A bug fix
  • perf: A code change that improves performance
  • refactor: A code change that neither fixes a bug nor adds a feature
  • revert
  • style: Changes that do not affect the meaning of the code (white-space, formatting, missing semi-colons, etc)
  • test: Adding missing tests or correcting existing tests

Docs website

start the local site:

yarn start:web

build the site:

yarn build:web

deploy the site via github-pages:

USE_SSH=true GIT_USER=pbeshai yarn workspace @tidyjs/tidy-website deploy

Ideally we can automate this via GitHub Actions one day!


Shout out to Netflix

I want to give a big shout out to Netflix, my current employer, for giving me the opportunity to work on this project and to open source it. It's a great place to work and if you enjoy tinkering with data-related things, I'd strongly recommend checking out our analytics department. – Peter Beshai
