• Stars
    star
    1,251
  • Rank 37,308 (Top 0.8 %)
  • Language
    JavaScript
  • License
    BSD 3-Clause "New...
  • Created about 4 years ago
  • Updated about 1 month ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Query processing and transformation of array-backed data tables.

Arquero

Arquero is a JavaScript library for query processing and transformation of array-backed data tables. Following the relational algebra and inspired by the design of dplyr, Arquero provides a fluent API for manipulating column-oriented data frames. Arquero supports a range of data transformation tasks, including filter, sample, aggregation, window, join, and reshaping operations.

  • Fast: process data tables with million+ rows.
  • Flexible: query over arrays, typed arrays, array-like objects, or Apache Arrow columns.
  • Full-Featured: perform a variety of wrangling and analysis tasks.
  • Extensible: add new column types or functions, including aggregate & window operations.
  • Lightweight: small size, minimal dependencies.

To get up and running, start with the Introducing Arquero tutorial, part of the Arquero notebook collection.

Have a question or need help? Post to the Arquero GitHub Discussions board.

Arquero is Spanish for "archer": if datasets are arrows, Arquero helps their aim stay true. 🏹 Arquero also refers to a goalkeeper: safeguard your data from analytic "own goals"! 🥅 ✋ ⚽

API Documentation

  • Top-Level API - All methods in the top-level Arquero namespace.
  • Table - Table access and output methods.
  • Verbs - Table transformation verbs.
  • Op Functions - All functions, including aggregate and window functions.
  • Expressions - Parsing and generation of table expressions.
  • Extensibility - Extend Arquero with new expression functions or table verbs.

Example

The core abstractions in Arquero are data tables, which model each column as an array of values, and verbs that transform data and return new tables. Verbs are table methods, allowing method chaining for multi-step transformations. Though each table is unique, many verbs reuse the underlying columns to limit duplication.

import { all, desc, op, table } from 'arquero';

// Average hours of sunshine per month, from https://usclimatedata.com/.
const dt = table({
  'Seattle': [69,108,178,207,253,268,312,281,221,142,72,52],
  'Chicago': [135,136,187,215,281,311,318,283,226,193,113,106],
  'San Francisco': [165,182,251,281,314,330,300,272,267,243,189,156]
});

// Sorted differences between Seattle and Chicago.
// Table expressions use arrow function syntax.
dt.derive({
    month: d => op.row_number(),
    diff:  d => d.Seattle - d.Chicago
  })
  .select('month', 'diff')
  .orderby(desc('diff'))
  .print();

// Is Seattle more correlated with San Francisco or Chicago?
// Operations accept column name strings outside a function context.
dt.rollup({
    corr_sf:  op.corr('Seattle', 'San Francisco'),
    corr_chi: op.corr('Seattle', 'Chicago')
  })
  .print();

// Aggregate statistics per city, as output objects.
// Reshape (fold) the data to a two column layout: city, sun.
dt.fold(all(), { as: ['city', 'sun'] })
  .groupby('city')
  .rollup({
    min:  d => op.min(d.sun), // functional form of op.min('sun')
    max:  d => op.max(d.sun),
    avg:  d => op.average(d.sun),
    med:  d => op.median(d.sun),
    // functional forms permit flexible table expressions
    skew: ({sun: s}) => (op.mean(s) - op.median(s)) / op.stdev(s) || 0
  })
  .objects()

Usage

In Browser

To use in the browser, you can load Arquero from a content delivery network:

<script src="https://cdn.jsdelivr.net/npm/arquero@latest"></script>

Arquero will be imported into the aq global object. The default browser bundle does not include the Apache Arrow library. To perform Arrow encoding using toArrow() or binary file loading using loadArrow(), import Apache Arrow first:

<script src="https://cdn.jsdelivr.net/npm/apache-arrow@latest"></script>
<script src="https://cdn.jsdelivr.net/npm/arquero@latest"></script>

Alternatively, you can build and import arquero.min.js from the dist directory, or build your own application bundle. When building custom application bundles for the browser, the module bundler should draw from the browser property of Arquero's package.json file. For example, if using rollup, pass the browser: true option to the node-resolve plugin.

Arquero uses modern JavaScript features, and so will not work with some outdated browsers. To use Arquero with older browsers including Internet Explorer, set up your project with a transpiler such as Babel.

In Node.js or Application Bundles

First install arquero as a dependency, via npm install arquero --save or yarn add arquero. Arquero assumes Node version 12 or higher.

Import using CommonJS module syntax:

const aq = require('arquero');

Import using ES module syntax, import all exports into a single object:

import * as aq from 'arquero';

Import using ES module syntax, with targeted imports:

import { op, table } from 'arquero';

Build Instructions

To build and develop Arquero locally:

More Repositories

1

visualization-curriculum

A data visualization curriculum of interactive notebooks.
Jupyter Notebook
1,275
star
2

mosaic

An extensible framework for linking databases and interactive views.
JavaScript
688
star
3

draco

Visualization Constraints and Weight Learning
TypeScript
222
star
4

d3-tutorials

D3 Tutorials for CSE512 Data Visualization Course at University of Washington
HTML
170
star
5

imMens

Real-Time Visual Querying of Big Data
HTML
168
star
6

living-papers

Authoring tools for scholarly communication. Create interactive web pages or formal research papers from markdown source.
TeX
128
star
7

termite-data-server

Data Server for Topic Models
Python
120
star
8

errudite

An Interactive Tool for Scalable and Reproducible Error Analysis.
Python
104
star
9

gemini

A grammar and recommender system for animated transitions in Vega/Vega-Lite
JavaScript
103
star
10

vsup

Code for generating Value-Suppressing Uncertainty Palettes for use in D3 charts.
JavaScript
77
star
11

latent-space-cartography

Visual analysis of vector space embeddings
HTML
74
star
12

setcola

High-Level Constraints for Graph Layout
JavaScript
72
star
13

boba

Specifying and executing multiverse analysis
Python
62
star
14

termite-visualizations

[development moved to termite-data-server]
Python
61
star
15

rev

REV: Reverse-Engineering Visualizations
Python
60
star
16

graphscape

A directed graph model of the visualization design space, using Vega-Lite.
JavaScript
58
star
17

fast-kde

Fast, approximate Gaussian kernel density estimation.
JavaScript
56
star
18

bayesian-surprise

Bayesian Weighting for De-Biasing Thematic Maps
TeX
54
star
19

gestrec

A JavaScript implementation of the Protractor gesture recognizer.
JavaScript
36
star
20

perceptual-kernels

Data & source code for the perceptual kernels study
HTML
33
star
21

ellipsis

Visualization Storytelling Components
JavaScript
31
star
22

visual-embedding

Data & source code for the visual embedding model
MATLAB
31
star
23

boba-visualizer

A visual analysis tool for exploring multiverse outcomes
JavaScript
31
star
24

papers-vsup

Visualize uncertainty
TeX
27
star
25

arquero-sql

Database backend support for Arquero
JavaScript
24
star
26

color-naming-in-different-languages

JavaScript
24
star
27

arquero-worker

Worker thread support for Arquero.
JavaScript
22
star
28

living-papers-template

A Living Papers article starter template.
22
star
29

mosaic-framework-example

Using Mosaic and DuckDB within Observable Framework
TypeScript
22
star
30

dziban

Context-Aware, Recommender-Powered Visualization Authoring
Jupyter Notebook
21
star
31

draco-vis

Draco on the web
TypeScript
18
star
32

flechette

Fast, lightweight access to Apache Arrow data.
JavaScript
18
star
33

diagnostics

Topic Model Diagnostics
JavaScript
14
star
34

vegaserver

A simple node server that renders vega specs to SVG or PNG.
JavaScript
13
star
35

visual-encoding-effectiveness-data

Supplement material for "Assessing Effects of Task and Data Distribution on the Effectiveness of Visual Encodings".
JavaScript
13
star
36

divi

Automatically interact with SVG charts.
JavaScript
10
star
37

quantitative-color-data

Data for quantitative colormap study
R
10
star
38

citation-query

Retrieve paper citatation data from doi.org and Semantic Scholar.
JavaScript
10
star
39

arquero-arrow

Arrow serialization support for Arquero.
JavaScript
9
star
40

verp

The VERP Explorer
JavaScript
8
star
41

termite-stm

[development moved to termite-data-server]
Python
8
star
42

code-augmentation

Code augmentation editor
JavaScript
7
star
43

aggregate-animation-data

Supplement material for "Designing Animated Transitions to Convey Aggregate Operations"
JavaScript
7
star
44

vega-dataflow

Reactive dataflow processing.
JavaScript
7
star
45

trend-bias

Experiments on trend-fitting
TeX
6
star
46

termite-treetm

[development moved to termite-data-server]
Python
6
star
47

flights-arrow

Flight Dataset as Apache Arrow in Different Sizes
6
star
48

living-papers-paper

The UIST'23 Living Papers research paper and supplemental material.
JavaScript
5
star
49

fast-kde-benchmarks

Research archive of methods and benchmarks for fast, approximate Gaussian kernel density estimation.
JavaScript
5
star
50

gemini-supplemental-material

Supplemental material for "Gemini: A Grammar and Recommender System for Animated Transitions in Statistical Graphics"
HTML
5
star
51

uwdata.github.io

UW Interactive Data Lab web page
Svelte
5
star
52

palette-analyzer

Analyzes the local and global distances in [RGB, LAB, UCS, Color Names] model, given a palette.
HTML
5
star
53

draco-learn

Learning Weights for Draco
Python
4
star
54

draco-editor

The Draco Online Editor
CSS
4
star
55

datalib

We've moved! Please see https://github.com/vega/datalib
3
star
56

file-cache

File-based cache for JSON-serializable data.
JavaScript
3
star
57

istc-explorer

JavaScript
2
star
58

draco-analysis

Notebooks for Draco
Jupyter Notebook
2
star
59

draco-tools

Tools for Draco
JavaScript
2
star
60

living-papers-examples

Example Living Papers Articles
JavaScript
2
star
61

draco-tuner

An interactive application to modify Draco's knowledge base
TypeScript
1
star