palmerpenguins

The goal of palmerpenguins is to provide a great dataset for data exploration & visualization, as an alternative to iris.

Installation

You can install the released version of palmerpenguins from CRAN with:

install.packages("palmerpenguins")

To install the development version from GitHub use:

# install.packages("remotes")
remotes::install_github("allisonhorst/palmerpenguins")

About the data

Data were collected and made available by Dr. Kristen Gorman and the Palmer Station, Antarctica LTER, a member of the Long Term Ecological Research Network.

The palmerpenguins package contains two datasets.

library(palmerpenguins)
data(package = 'palmerpenguins')

One is called penguins, and is a simplified version of the raw data; see ?penguins for more info:

head(penguins)
#> # A tibble: 6 × 8
#>   species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex  
#>   <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
#> 1 Adelie  Torge…           39.1          18.7              181        3750 male 
#> 2 Adelie  Torge…           39.5          17.4              186        3800 fema…
#> 3 Adelie  Torge…           40.3          18                195        3250 fema…
#> 4 Adelie  Torge…           NA            NA                 NA          NA <NA> 
#> 5 Adelie  Torge…           36.7          19.3              193        3450 fema…
#> 6 Adelie  Torge…           39.3          20.6              190        3650 male 
#> # … with 1 more variable: year <int>

The second dataset is penguins_raw, and contains all the variables and original names as downloaded; see ?penguins_raw for more info.

head(penguins_raw)
#> # A tibble: 6 × 17
#>   studyName `Sample Number` Species          Region Island Stage `Individual ID`
#>   <chr>               <dbl> <chr>            <chr>  <chr>  <chr> <chr>          
#> 1 PAL0708                 1 Adelie Penguin … Anvers Torge… Adul… N1A1           
#> 2 PAL0708                 2 Adelie Penguin … Anvers Torge… Adul… N1A2           
#> 3 PAL0708                 3 Adelie Penguin … Anvers Torge… Adul… N2A1           
#> 4 PAL0708                 4 Adelie Penguin … Anvers Torge… Adul… N2A2           
#> 5 PAL0708                 5 Adelie Penguin … Anvers Torge… Adul… N3A1           
#> 6 PAL0708                 6 Adelie Penguin … Anvers Torge… Adul… N3A2           
#> # … with 10 more variables: `Clutch Completion` <chr>, `Date Egg` <date>,
#> #   `Culmen Length (mm)` <dbl>, `Culmen Depth (mm)` <dbl>,
#> #   `Flipper Length (mm)` <dbl>, `Body Mass (g)` <dbl>, Sex <chr>,
#> #   `Delta 15 N (o/oo)` <dbl>, `Delta 13 C (o/oo)` <dbl>, Comments <chr>

Both datasets contain data for 344 penguins. There are 3 different species of penguins in this dataset, collected from 3 islands in the Palmer Archipelago, Antarctica.

str(penguins)
#> tibble [344 × 8] (S3: tbl_df/tbl/data.frame)
#>  $ species          : Factor w/ 3 levels "Adelie","Chinstrap",..: 1 1 1 1 1 1 1 1 1 1 ...
#>  $ island           : Factor w/ 3 levels "Biscoe","Dream",..: 3 3 3 3 3 3 3 3 3 3 ...
#>  $ bill_length_mm   : num [1:344] 39.1 39.5 40.3 NA 36.7 39.3 38.9 39.2 34.1 42 ...
#>  $ bill_depth_mm    : num [1:344] 18.7 17.4 18 NA 19.3 20.6 17.8 19.6 18.1 20.2 ...
#>  $ flipper_length_mm: int [1:344] 181 186 195 NA 193 190 181 195 193 190 ...
#>  $ body_mass_g      : int [1:344] 3750 3800 3250 NA 3450 3650 3625 4675 3475 4250 ...
#>  $ sex              : Factor w/ 2 levels "female","male": 2 1 1 NA 1 2 1 2 NA NA ...
#>  $ year             : int [1:344] 2007 2007 2007 2007 2007 2007 2007 2007 2007 2007 ...
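The counts quoted above can be checked directly. The following is a minimal sketch that assumes the palmerpenguins package is installed; it uses only base R functions on the two datasets.

```r
library(palmerpenguins)

# Both datasets should describe the same 344 penguins
n_simplified <- nrow(penguins)
n_raw <- nrow(penguins_raw)

# species and island are factors in the simplified data,
# so nlevels() gives the number of distinct categories
n_species <- nlevels(penguins$species)
n_islands <- nlevels(penguins$island)

# Missing values per column; several columns contain NA,
# which is why summaries need na.rm = TRUE
missing_per_column <- colSums(is.na(penguins))

print(c(simplified = n_simplified, raw = n_raw))
print(c(species = n_species, islands = n_islands))
print(missing_per_column)
```

Checking the missing-value counts early is useful because functions such as mean() return NA unless missing values are explicitly dropped.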

We gratefully acknowledge Palmer Station LTER and the US LTER Network. Special thanks to Marty Downs (Director, LTER Network Office) for help regarding the data license & use.

Examples

You can find these and more code examples for exploring palmerpenguins in vignette("examples").

Penguins are fun to summarize! For example:

library(tidyverse)
penguins %>% 
  count(species)
#> # A tibble: 3 × 2
#>   species       n
#>   <fct>     <int>
#> 1 Adelie      152
#> 2 Chinstrap    68
#> 3 Gentoo      124
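
penguins %>% 
  group_by(species) %>% 
  # A lambda is used because passing extra arguments (na.rm) through
  # across() is deprecated in dplyr 1.1.0 and later
  summarize(across(where(is.numeric), ~ mean(.x, na.rm = TRUE)))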
#> # A tibble: 3 × 6
#>   species   bill_length_mm bill_depth_mm flipper_length_mm body_mass_g  year
#>   <fct>              <dbl>         <dbl>             <dbl>       <dbl> <dbl>
#> 1 Adelie              38.8          18.3              190.       3701. 2008.
#> 2 Chinstrap           48.8          18.4              196.       3733. 2008.
#> 3 Gentoo              47.5          15.0              217.       5076. 2008.

Penguins are fun to visualize! For example:
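The plot that accompanied this line is not available here. As a stand-in, the following is a minimal sketch of one possible visualization (not necessarily the one in the original README): flipper length against body mass, colored by species. It assumes ggplot2 is installed; the object names penguins_complete and p are illustrative.

```r
library(palmerpenguins)
library(ggplot2)

# Drop rows with missing measurements so ggplot2 does not
# warn about removed points
penguins_complete <- subset(
  penguins,
  !is.na(flipper_length_mm) & !is.na(body_mass_g)
)

# Scatter plot of flipper length vs. body mass, one color and shape per species
p <- ggplot(
  penguins_complete,
  aes(x = flipper_length_mm, y = body_mass_g, color = species, shape = species)
) +
  geom_point(size = 2, alpha = 0.8) +
  labs(
    title = "Flipper length and body mass",
    x = "Flipper length (mm)",
    y = "Body mass (g)",
    color = "Species",
    shape = "Species"
  ) +
  theme_minimal()

print(p)
```

More plots are available in vignette("examples").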

Artwork

You can download palmerpenguins art (useful for teaching with the data) in vignette("art"). If you use this artwork, please cite with: “Artwork by @allison_horst”.

Meet the Palmer penguins

Bill dimensions

The culmen is the upper ridge of a bird’s bill. In the simplified penguins data, culmen length and depth are renamed as variables bill_length_mm and bill_depth_mm to be more intuitive.
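To make the renaming concrete, here is a small sketch that pairs the raw culmen columns with their simplified counterparts. The lookup vector bill_columns is a hypothetical helper, not part of the package, and the side-by-side comparison assumes both datasets list the penguins in the same row order.

```r
library(palmerpenguins)

# Raw column names (as downloaded) mapped to the simplified names
bill_columns <- c(
  "Culmen Length (mm)" = "bill_length_mm",
  "Culmen Depth (mm)"  = "bill_depth_mm"
)

# Confirm that each name exists in its respective dataset
raw_names_present <- all(names(bill_columns) %in% names(penguins_raw))
simplified_names_present <- all(bill_columns %in% names(penguins))

# Side-by-side look at the first few bill length values
# (assumes the rows of both datasets are in the same order)
bill_length_preview <- head(data.frame(
  raw_culmen_length = penguins_raw[["Culmen Length (mm)"]],
  bill_length_mm    = penguins$bill_length_mm
))

print(raw_names_present)
print(simplified_names_present)
print(bill_length_preview)
```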

For this penguin data, the culmen (bill) length and depth are measured as shown in the figure (thanks Kristen Gorman for clarifying!):

[Figure: culmen (bill) length and depth measurement diagram]

License

Data are available by CC-0 license in accordance with the Palmer Station LTER Data Policy and the LTER Data Access Policy for Type I data.

Citation

To cite the palmerpenguins package, please use:

citation("palmerpenguins")
#> 
#> To cite palmerpenguins in publications use:
#> 
#>   Horst AM, Hill AP, Gorman KB (2020). palmerpenguins: Palmer
#>   Archipelago (Antarctica) penguin data. R package version 0.1.0.
#>   https://allisonhorst.github.io/palmerpenguins/. doi:
#>   10.5281/zenodo.3960218.
#> 
#> A BibTeX entry for LaTeX users is
#> 
#>   @Manual{,
#>     title = {palmerpenguins: Palmer Archipelago (Antarctica) penguin data},
#>     author = {Allison Marie Horst and Alison Presmanes Hill and Kristen B Gorman},
#>     year = {2020},
#>     note = {R package version 0.1.0},
#>     doi = {10.5281/zenodo.3960218},
#>     url = {https://allisonhorst.github.io/palmerpenguins/},
#>   }

Additional data use information

Anyone interested in publishing the data should contact Dr. Kristen Gorman about analysis and working together on any final products. From Gorman et al. (2014): “Individuals interested in using these data are expected to follow the US LTER Network’s Data Access Policy, Requirements and Use Agreement: https://lternet.edu/data-access-policy/.”

References

Data originally published in:

  • Gorman KB, Williams TD, Fraser WR (2014). Ecological sexual dimorphism and environmental variability within a community of Antarctic penguins (genus Pygoscelis). PLoS ONE 9(3):e90081. https://doi.org/10.1371/journal.pone.0090081

Data citations:

Adélie penguins:

  • Palmer Station Antarctica LTER and K. Gorman, 2020. Structural size measurements and isotopic signatures of foraging among adult male and female Adélie penguins (Pygoscelis adeliae) nesting along the Palmer Archipelago near Palmer Station, 2007-2009 ver 5. Environmental Data Initiative. https://doi.org/10.6073/pasta/98b16d7d563f265cb52372c8ca99e60f (Accessed 2020-06-08).

Gentoo penguins:

  • Palmer Station Antarctica LTER and K. Gorman, 2020. Structural size measurements and isotopic signatures of foraging among adult male and female Gentoo penguin (Pygoscelis papua) nesting along the Palmer Archipelago near Palmer Station, 2007-2009 ver 5. Environmental Data Initiative. https://doi.org/10.6073/pasta/7fca67fb28d56ee2ffa3d9370ebda689 (Accessed 2020-06-08).

Chinstrap penguins:

  • Palmer Station Antarctica LTER and K. Gorman, 2020. Structural size measurements and isotopic signatures of foraging among adult male and female Chinstrap penguin (Pygoscelis antarcticus) nesting along the Palmer Archipelago near Palmer Station, 2007-2009 ver 6. Environmental Data Initiative. https://doi.org/10.6073/pasta/c14dfcfada8ea13a17536e73eb6fbe9e (Accessed 2020-06-08).
