  • Stars: 493
  • Rank: 87,631 (Top 2%)
  • Language: R
  • License: Other
  • Created over 15 years ago
  • Updated over 1 year ago

Repository Details

An R package for splitting large problems into simpler pieces, applying a function to each piece, and combining the results

plyr

plyr is a set of tools for a common set of problems: you need to split up a big data structure into homogeneous pieces, apply a function to each piece and then combine all the results back together. For example, you might want to:

  • fit the same model to each patient subset of a data frame
  • quickly calculate summary statistics for each group
  • perform group-wise transformations like scaling or standardising
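The split, apply and combine steps described above can be sketched with base R alone. This is a minimal illustration rather than anything taken from plyr itself; the built-in `mtcars` data set and the column names `n` and `mean_mpg` are assumptions chosen for the example.

```r
# Split: break mtcars (a built-in data set) into one data frame per cylinder count.
pieces <- split(mtcars, mtcars$cyl)

# Apply: compute the same summary for every piece.
summaries <- lapply(pieces, function(piece) {
  data.frame(
    cyl = piece$cyl[1],
    n = nrow(piece),
    mean_mpg = mean(piece$mpg)
  )
})

# Combine: stack the per-piece results back into a single data frame.
result <- do.call(rbind, summaries)
print(result)
```

Each step is a separate call with its own conventions, which is the bookkeeping plyr aims to hide.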

It's already possible to do this with base R functions (like split and the apply family of functions), but plyr makes it all a bit easier with:

  • totally consistent names, arguments and outputs
  • convenient parallelisation through the foreach package
  • input from and output to data.frames, matrices and lists
  • progress bars to keep track of long running operations
  • built-in error recovery, and informative error messages
  • labels that are maintained across all transformations
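As a rough sketch of what the consistent naming looks like in practice: plyr functions are named by input and output type, so `ddply()` takes a data frame and returns a data frame, `dlply()` returns a list, and `ldply()` takes a list and returns a data frame. The example below assumes plyr is installed and again uses the built-in `mtcars` data set; the grouping variable and model formula are illustrative choices only.

```r
library(plyr)

# ddply(): data frame in, data frame out; one summary row per value of `cyl`.
by_cyl <- ddply(mtcars, "cyl", summarise,
                n = length(mpg),
                mean_mpg = mean(mpg))

# dlply(): data frame in, list out; fit the same model to every group.
models <- dlply(mtcars, "cyl", function(piece) lm(mpg ~ wt, data = piece))

# ldply(): list in, data frame out; the `cyl` labels are carried along.
coefs <- ldply(models, coef)

print(by_cyl)
print(coefs)
```

The `cyl` column in `coefs` comes from the labels plyr attached when the data was split, so there is no need to rebuild it by hand.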

Considerable effort has been put into making plyr fast and memory efficient, and in many cases plyr is as fast as, or faster than, the built-in equivalents.

A detailed introduction to plyr has been published in JSS: "The Split-Apply-Combine Strategy for Data Analysis", http://www.jstatsoft.org/v40/i01/. You can find out more at https://had.co.nz/plyr/, or track development at https://github.com/hadley/plyr. You can ask questions about plyr (and data manipulation in general) on the plyr mailing list. Sign up at https://groups.google.com/group/manipulatr.

Status

plyr is retired: this means only changes necessary to keep it on CRAN will be made. We recommend using dplyr (for data frames) or purrr (for lists) instead.
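For readers moving off plyr, the following is a hedged sketch of roughly equivalent code in the recommended packages. It assumes dplyr and purrr are installed and is not an official migration guide; the data set, grouping variable and model are illustrative.

```r
library(dplyr)
library(purrr)

# dplyr: grouped summary of a data frame
# (roughly what ddply() with summarise did in plyr).
by_cyl <- mtcars %>%
  group_by(cyl) %>%
  summarise(n = n(), mean_mpg = mean(mpg))

# purrr: apply a function to each element of a list
# (roughly what llply() did in plyr).
pieces <- split(mtcars, mtcars$cyl)
models <- map(pieces, function(piece) lm(mpg ~ wt, data = piece))

# map_dbl() returns a plain numeric vector, one slope per group.
slopes <- map_dbl(models, function(m) coef(m)[["wt"]])

print(by_cyl)
print(slopes)
```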

More Repositories

1

r4ds

R for data science: a book
R
4,407
star
2

adv-r

Advanced R: a book
TeX
2,248
star
3

stats337

Readings in applied data science
R
1,626
star
4

ggplot2-book

ggplot2: elegant graphics for data analysis
Perl
1,511
star
5

mastering-shiny

Mastering Shiny: a book
R
1,323
star
6

r-pkgs

Building R packages
R
765
star
7

tidy-data

A paper on data tidying
TeX
404
star
8

emo

Easily insert emoji into R and RMarkdown
R
396
star
9

r-internals

Documentation for R's internal C API
336
star
10

bigvis

Exploratory data analysis for large datasets (10-100 million observations)
C++
286
star
11

strict

Make R a little bit stricter
R
219
star
12

data-baby-names

Distribution of US baby names, 1880-2008
R
207
star
13

reshape

An R package to flexibly rearrange, reshape and aggregate data
R
206
star
14

data-movies

Download data from IMDB movies and parse into useful form
Ruby
203
star
15

pryr

Pry open the covers of R
R
201
star
16

assertthat

User friendly assertions for R
R
200
star
17

r2d3

ggplot2 + d3 = r2d3
JavaScript
183
star
18

babynames

An R package containing US baby names from the SSA
R
131
star
19

lazyeval

Lazy evaluation: an alternative to non-standard evaluation (NSE) for R
R
131
star
20

secure

Secure private R data in public packages
R
105
star
21

purrrlyr

Tools at the intersection of purrr and dplyr
C++
103
star
22

lineprof

Visualise line profiling results in R
JavaScript
102
star
23

requirements

Find packages required for code to run
R
75
star
24

ggstat

Statistical computations for visualisation
C++
70
star
25

r-python

Exploring data related to relative usage of R vs. python
R
68
star
26

gg2v

Render ggplot2 graphics using vega
JavaScript
67
star
27

building-permits

Code & data accompanying "whole-game" youtube video
66
star
28

stringb

A dependency-free version of stringr
R
65
star
29

precis

Succinctly Summarise Data Frames
R
63
star
30

r-on-github

An exploration of R code and packages on GitHub, using the GitHub search and repo APIs
R
54
star
31

data-housing-crisis

Clean data related to the housing crisis
R
53
star
32

decumar

An alternative to sweave
R
49
star
33

tidy-tools

Building tidy tools in R, a workshop
R
49
star
34

neiss

Data from National Electronic Injury Surveillance System
HTML
48
star
35

monads

Work with Monads in R
R
47
star
36

joy-of-fp

Supplemental materials for "The joy of functional programming"
R
45
star
37

crantastic

Source code for crantastic.org: a community site for R
Ruby
44
star
38

recipes

Wickham family recipes
R
43
star
39

oldbookdown

R
39
star
40

cubelyr

A data cube dplyr backend
R
36
star
41

data-fuel-economy

Fuel economy data, 1978-2008
35
star
42

table-shapes

34
star
43

lvplot

Letter value boxplots for R
R
34
star
44

usdanutrients

USDA nutrient database as an R data package
R
34
star
45

reactive-docs

An introduction to reactive documents in R (for teaching stats)
34
star
46

vis-eda

Visualisation for EDA
R
32
star
47

rsmith

A static site generator for R inspired by metalsmith.io
R
32
star
48

helpr

An alternative html help system for R
R
31
star
49

sfhousing

Code to download and process SF housing sales data
R
31
star
50

profr

An alternative profiling package for R
R
30
star
51

cocktails

Hadley's cocktail book
R
29
star
52

productplots

Product graphics for categorical data
R
29
star
53

shinySignals

R
29
star
54

data-counties

County boundaries in csv for all US counties
R
28
star
55

l1tf

L1 trend filtering
C
27
star
56

ggplot1

Before there was ggplot2
R
26
star
57

roxygen3

R
23
star
58

15-state-of-the-union

R
22
star
59

minby

Compute minimum of one variable grouped by another
R
21
star
60

mylittlepony

A package for learning about the basics of package development
R
19
star
61

tidyverse-booster

R
19
star
62

hadley.github.com

Personal blog
JavaScript
18
star
63

boxplots-paper

TeX
18
star
64

mturkr

Tools to make MTurk tasks easy to run from R
R
18
star
65

monthApp

An example of a Shiny app-package
R
18
star
66

docker

My personal dockerfiles
17
star
67

fueleconomy

EPA fuel economy data in an R package
R
16
star
68

meifly

An R package for exploring ensembles of (generalised) linear models
R
16
star
69

clusterfly

An R package for visualising high-dimensional clustering algorithms
R
16
star
70

rminds

Sample R code for visualising models (especially models in data space)
16
star
71

sinartra

R
15
star
72

eggnogr

Shiny app for scaling eggnog
R
14
star
73

beautiful-data

Book chapter for beautiful data
14
star
74

15-student-papers

Graphics & computing student paper winners @ JSM 2015
R
14
star
75

fec-dplyr

Exploration of FEC contributions data with dplyr
R
13
star
76

mexico-mortality

Mortality data for Mexico, along with useful extra data
R
13
star
77

grouperise

Explore the idea of "grouperised" functions
C
13
star
78

yrbss

Youth Risk Behaviour Surveillance System Data
R
12
star
79

mutatr

Prototype-based mutable objects for R, based on io and javascript
R
12
star
80

lvplot-paper

TeX
12
star
81

tanglekit

R bindings for Bret Victor's tangle.js
JavaScript
11
star
82

nasaweather

Data from the 2006 ASA data expo
R
11
star
83

ggplot2-bayarea

Data, code and slides for ggplot2 talk given to Bay Area useR group, 17 Sep 2009
R
11
star
84

htmlbook

Convert a Quarto book to O'Reilly's html book format
HTML
11
star
85

proto

Prototype Object-Based Programming
R
10
star
86

vita

HTML
10
star
87

classifly

An R package to visualise high-dimensional classification boundaries with GGobi
R
10
star
88

ideas

Research ideas
10
star
89

cran-logs-dplyr

A case study using dplyr on a large dataset: all package downloads from the RStudio CRAN mirror.
R
9
star
90

scagnostics

An R package to calculate graph theoretic scagnostics
C++
9
star
91

ggplot2movies

What the package does (one paragraph).
R
9
star
92

tidycore

Core tidyverse packages
R
9
star
93

densityvis

R package for cutting and binning data
R
9
star
94

fortify

Convert any R object to a data frame, suitable for visualisation
R
9
star
95

hadladdin

RStudio add-ins by Hadley
R
9
star
96

hadcol

Hadley's utilities for adding columns
R
9
star
97

talk-httr2

R
9
star
98

localmds

Local multidimensional scaling, an R package
8
star
99

layers

Layers code extracted out of ggplot2
R
8
star
100

cranatics

Data about which cran maintainer accepted which package
R
8
star