RStartHere

A guide to some of the most useful R Packages that we know about, organized by their role in data science.

Click here to suggest packages.

Data Science Workflow

[Figure: The data science workflow]

Each data science project is different, but each follows the same general steps. You:

  1. Import your data into R

  2. Tidy it

  3. Understand your data by iteratively

    1. visualizing
    2. transforming and
    3. modeling your data
  4. Infer how your understanding applies to other data sets (including future data, i.e. predictions)

  5. Communicate your results to an audience, or

  6. Automate your analysis for easy reuse

  7. Program the whole way through, since you do each of these things on a computer
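
As a rough illustration of how these steps fit together, the sketch below walks through them using only base R and the built-in mtcars data set, so it runs without installing anything. The data set, the derived variable, and the model are assumptions chosen for illustration; the packages listed in the following sections offer richer tools for each step.

```r
# 1. Import: mtcars ships with R and stands in for data read from a file.
cars <- mtcars

# 2. Tidy: move the row names (car models) into a regular column.
cars$model <- rownames(cars)
rownames(cars) <- NULL

# 3a. Transform: derive weight in kilograms (wt is recorded in 1000 lbs).
cars$wt_kg <- cars$wt * 1000 * 0.45359237

# 3b. Visualize: scatterplot of fuel economy against weight.
plot(cars$wt_kg, cars$mpg, xlab = "Weight (kg)", ylab = "Miles per gallon")

# 3c. Model: fit a simple linear regression.
fit <- lm(mpg ~ wt_kg, data = cars)

# 4. Infer: predict fuel economy for a hypothetical 1,500 kg car.
new_car <- data.frame(wt_kg = 1500)
predicted_mpg <- predict(fit, newdata = new_car)

# 5. Communicate: print a short summary of the result.
cat("Predicted mpg for a 1,500 kg car:", round(predicted_mpg, 1), "\n")
```

Wrapping these lines in a function or script would be one way to approach step 6, automating the analysis for reuse.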

Below we list the most useful R packages that we know of for each step.

Import

These packages help you import data into R and save data.

  • feather - a fast, lightweight file format used by both R and Python
  • readr - reads tabular data
  • readxl - reads Microsoft Excel spreadsheets
  • openxlsx - reads and writes Microsoft Excel (.xlsx) files
  • googlesheets - reads Google spreadsheets
  • haven - reads SAS, SPSS, and Stata files
  • httr - reads data from web APIs
  • rvest - scrapes data from web pages
  • xml2 - reads HTML and XML data
  • webreadr - reads common web log formats
  • DBI - a universal interface to database management systems (DBMS)
  • PivotalR - reads data from and interfaces with Postgres, Greenplum, and HAWQ
  • dplyr - contains an interface to common databases
  • data.table - fread() for fast table reading
  • git2r - tools to access git repositories
  • BioInstaller - downloader for biological software and databases
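
For example, a minimal sketch of importing and saving tabular data with readr might look like the following. It assumes readr is installed; the file contents and column names are made up for illustration.

```r
library(readr)

# Write a tiny CSV to a temporary file so the example is self-contained.
path <- tempfile(fileext = ".csv")
writeLines(c("name,score", "ada,9.5", "grace,8.0"), path)

# col_types = "cd" declares one character column and one double column.
scores <- read_csv(path, col_types = "cd")

# Save the data back out with readr's writer.
out_path <- tempfile(fileext = ".csv")
write_csv(scores, out_path)
```

Declaring column types up front is optional, but it keeps the import predictable when a file's contents change.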

Tidy

These packages help you wrangle your data into a form that is easy to analyze in R.

  • tidyr - tools for tidying layout of tabular data
  • dplyr - tools for joining multiple tables into a tidy data set
  • purrr - tools for applying R functions to data structures, very useful when tidying
  • broom - tools for tidying statistical models into data frames
  • zoo - data structures for time series data
  • PivotalR - R wrappers for in-database SQL operations (i.e. join, group by)

Visualize

These packages help you visualize your data.

  • ggplot2 with extensions - a versatile system for making plots
    • ggthemes - plot style themes
    • ggmap - maps with Google Maps, Open Street Maps, etc.
    • ggiraph - interactive ggplots
    • ggstance - horizontal versions of common plots
    • GGally - scatterplot matrices
    • ggalt - additional coordinate systems, geoms, etc.
    • ggforce - additional geoms, etc.
    • ggrepel - prevent plot labels from overlapping
    • ggraph - graphs, networks, trees and more
    • ggpmisc - photo-biology related extensions
    • geomnet - network visualization
    • ggExtra - marginal histograms for a plot
    • gganimate - animations
    • plotROC - interactive ROC plots
    • ggspectra - tools for plotting light spectra
    • ggnetwork - geoms to plot networks
    • ggtech - style themes for plots
    • ggradar - radar charts
    • ggTimeSeries - time series visualizations
    • ggtree - tree visualizations
    • ggseas - seasonal adjustment tools
  • lattice - Trellis graphics
  • rgl - interactive 3D plots
  • ggvis - versatile system for interactive graphs
  • htmlwidgets - framework for creating JavaScript widgets with R
  • rCharts - many interactive JavaScript visualizations
  • coefplot - visualizes model statistics
  • quantmod - candlestick financial charts
  • colorspace - HCL-based color palettes
  • viridis - Matplotlib viridis color palette for R
  • munsell - Munsell color palettes for R.
  • RColorBrewer - color palettes for plots. No manual or website.
  • dichromat - color-blind friendly palettes. No manual or website.
  • igraph - Network Analysis and Visualization
  • latticeExtra - Extensions for lattice graphics
  • sp - tools for spatial data
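
A minimal ggplot2 sketch, assuming ggplot2 is installed and a PNG graphics device is available, could look like this. The choice of the built-in mtcars data and of the plotted variables is only for illustration.

```r
library(ggplot2)

# Build a scatterplot layer by layer: data, aesthetic mapping, geom, labels.
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(x = "Weight (1000 lbs)", y = "Miles per gallon")

# Save to a temporary file; ggsave() infers the format from the extension.
plot_path <- tempfile(fileext = ".png")
ggsave(plot_path, plot = p, width = 5, height = 4)
```

Many of the extension packages listed above add further geoms, themes, or scales that slot into the same `+` syntax.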

Transform

These packages help you transform your data into new types of data.

  • dplyr - a grammar of data transformation
  • magrittr - a concise syntax for calling sequences of functions
  • tibble - efficient display structure for tabular data
  • stringr - tools for working with strings and regular expressions
  • lubridate - tools for working with dates and times
  • xts - tools for time series based data
  • data.table - fast data manipulation
  • vtreat - tools for pre-processing variables for predictive modeling
  • stringi - fast string processing facilities.
  • Matrix - LAPACK methods for dense and sparse matrix operations
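
To give a sense of what a "grammar of data transformation" looks like in practice, here is a short sketch that chains dplyr verbs with the magrittr pipe (which dplyr re-exports). It assumes dplyr is installed; the filter threshold and summary columns are arbitrary choices for illustration.

```r
library(dplyr)

# Keep cars above 100 horsepower, then summarise fuel economy per cylinder count.
by_cyl <- mtcars %>%
  filter(hp > 100) %>%
  group_by(cyl) %>%
  summarise(mean_mpg = mean(mpg), n_cars = n())
```

Each verb takes a data frame and returns a data frame, which is what lets the steps be read top to bottom.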

Model/Infer

These packages help you build models and make inferences. Often the same packages will focus on both topics.

  • car - functions from An R Companion to Applied Regression
  • Hmisc - miscellaneous functions for data analysis
  • multcomp - Simultaneous Inference in General Parametric Models
  • pbkrtest - parametric bootstrap test for linear mixed effects models
  • mvtnorm - Multivariate Normal and t Distributions
  • MatrixModels - Modelling with Sparse And Dense Matrices
  • SparseM - linear algebra for sparse matrices
  • lme4 - Linear Mixed-Effects Models using Eigen C++ library
  • broom - tools for tidying statistical models into data frames
  • caret - tools for Classification And REgression Training
  • glmnet - generalized linear models via penalized maximum likelihood
  • mosaic - Tools for teaching mathematics, statistics, computation and modeling
  • gbm - gradient boosted regression models
  • xgboost - Extreme Gradient Boosting
  • randomForest - Random Forests for Classification and Regression
  • ranger - a fast implementation of Random Forests
  • h2o - parallel distributed machine learning algorithms
  • ROCR - plots to visualize classifier performance
  • pROC - Tools for visualizing, smoothing and comparing ROC curves
  • PivotalR - R wrappers for MADlib's parallel distributed machine learning algorithms
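
As a small example of how modeling output can be brought back into data frames, the sketch below fits a linear model with base R's lm() and tidies it with broom. It assumes broom is installed; the model formula is an arbitrary illustration using the built-in mtcars data.

```r
library(broom)

# Fit a simple linear regression with base R.
fit <- lm(mpg ~ wt, data = mtcars)

# tidy() returns one row per model term (estimate, std.error, and so on).
coefs <- tidy(fit)

# glance() returns a one-row summary of the whole model.
fit_stats <- glance(fit)
```

Because both results are ordinary data frames, they can be passed straight to the transformation and visualization packages listed earlier.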

Communicate

These packages help you communicate the results of data science to your audiences.

  • rmarkdown - easy-to-use format for reproducible reports and dynamic documents in R
  • knitr - embed R code within pdf and html reports
  • flexdashboard - easy-to-create dashboards based on rmarkdown
  • bookdown - books and long documents built on R Markdown
  • rticles - ready to use R Markdown templates
  • tufte - Tufte handout R Markdown template
  • DT - Interactive data tables
  • pixiedust - Customized tables
  • xtable - Customized tables
  • highr - Syntax Highlighting for R Source Code
  • formatR - tidy_source() to format R source code
  • yaml - Methods to convert R data to YAML and back
  • pander - renders R objects into Pandoc markdown.
  • configr - Integrated and improved configuration file parser (json,ini,yaml,toml).

Automate

These packages help you create data science products that automate your analyses.

Program

These packages make it easier to program with the R language.

  • RStudio Desktop IDE - IDE application for R
  • RStudio Server Open Source - server based IDE for R
  • RStudio Server Professional - server based IDE for R enhanced with features for business enterprises
  • devtools - tools that make it easier to develop R packages
  • packrat - creates project specific libraries, which handle package versioning and enhance reproducibility
  • drat - tools to create and use alternative R package repositories
  • testthat - easy-to-use system for unit testing packages
  • roxygen2 - easy-to-use method for documenting packages
  • purrr - tools for applying R functions to data structures
  • profvis - visualizes code profiling data from R
  • Rcpp - C++ API for R
  • R6 - fast, simple object class that uses reference semantics
  • htmltools - Tools for HTML generation and output
  • nloptr - interface to NLopt non-linear optimization library.
  • minqa - optimization algorithms.
  • rngtools - Utilities for working with Random Number Generators
  • NMF - Nonnegative Matrix Factorization
  • crayon - Adds color to terminal output
  • RJSONIO - convert R objects to JSON notation
  • jsonlite - a fast JSON parser and generator for R
  • RcppArmadillo - interface to 'Armadillo' Templated Linear Algebra Library
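
As an illustration of one of these tools, the sketch below writes a unit test with testthat. It assumes testthat is installed; rescale01() is a hypothetical helper defined only for this example.

```r
library(testthat)

# Hypothetical helper under test: rescale a numeric vector onto [0, 1].
rescale01 <- function(x) {
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# A test groups related expectations under a descriptive name.
test_that("rescale01 maps values onto [0, 1]", {
  scaled <- rescale01(c(2, 4, 6))
  expect_equal(scaled, c(0, 0.5, 1))
})
```

Inside a package, tests like this would normally live under tests/testthat/ and be run with devtools::test().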

Data

These packages contain data sets to use as training data or toy examples.

  • babynames - Names given to US babies 1880-2014
  • neiss - sample of all accidents reported to US emergency rooms 2009-2014
  • yrbss - Youth Risk Behaviour Surveillance System data from 1991 to 2013
  • nycflights13 - all out-bound flights from NYC in 2013
  • hflights - flights departing Houston in 2011
  • USAboundaries - Historical and Contemporary Boundaries of the United States of America
  • rworldmap - country border data
  • usdanutrients - USDA nutrient database
  • fueleconomy - EPA fuel economy data
  • nasaweather - geographic and atmospheric measures on a very coarse 24 by 24 grid covering Central America
  • mexico-mortality - deaths in Mexico
  • data-movies and ggplot2movies - data from the Internet Movie Database (IMDB)
  • pop-flows - Population flows around the USA in 2008
  • data-housing-crisis - Clean data related to the 2008 US housing crisis
  • gun-sales - Statistical analysis of monthly background checks of gun purchases from the NY Times
  • stationaRy - hourly meteorological data from one of thousands of global stations
  • gapminder - Excerpt from the Gapminder data
  • janeaustenr - Jane Austen's Complete Novels

Criteria

What makes an R Package useful? A useful R package should perform a useful task, and it should do it well. Here are some criteria that we used to make the list.

  • The code in the package runs fast, with few errors.
  • The code in the package has an intuitive syntax that is easy to remember.
  • The package plays well with other packages; you do not need to munge your data into new forms to use the package.
  • The package is widely used and recommended by its users.
  • The package has a development website, or series of vignettes, that make the package easy to learn.
  • The package is developed in the open (e.g. on GitHub or R-Forge).
  • The package uses tests to ensure that it will be stable and bug free well into the future.
  • The package is stable and available from CRAN, or we are personally involved with the package and committed to its development.

For other useful choices, please check out our list of popular packages that did not quite meet these criteria.

You can learn more about packages in R with the CRAN task views.