ungeviz

Tools for visualizing uncertainty with ggplot2.

This package provides add-on functionality for ggplot2 to visualize uncertainty. It focuses on hypothetical outcome plots (HOPs) and provides bootstrapping and sampling functionality that integrates well with the ggplot2 API.

The package name comes from the German word “Ungewissheit”, which means uncertainty.

Installation

devtools::install_github("wilkelab/ungeviz")

Sampling and bootstrapping

The sampler() and bootstrapper() functions generate sampling and bootstrapping objects, respectively, that can be used in place of data in ggplot2 layers. These objects are helpful when creating HOPs.

library(ggplot2)
library(dplyr)
library(forcats)
library(ungeviz)
library(gganimate)

cacao %>% filter(location %in% c("Canada", "U.S.A.")) %>%
  ggplot(aes(rating, location)) +
  geom_point(
    position = position_jitter(height = 0.3, width = 0.05), 
    size = 0.4, color = "#0072B2", alpha = 1/2
  ) +
  geom_vpline(data = sampler(25, group = location), height = 0.6, color = "#D55E00") +
  theme_bw() + 
  # `.draw` is a generated column indicating the sample draw
  transition_states(.draw, 1, 3)
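
To make the role of the `.draw` column concrete, here is a minimal base-R sketch of grouped repeated sampling. It is not ungeviz's implementation: `sample_draws()` and the `toy` data frame are hypothetical, and the sketch assumes one row is drawn per group in each repetition.

```r
# Hypothetical helper (not part of ungeviz): draw `size` rows per group,
# `times` times, and tag each repetition with a `.draw` column.
sample_draws <- function(data, times, group, size = 1) {
  draws <- lapply(seq_len(times), function(i) {
    # split row indices by the grouping column, then sample within each group
    idx <- unlist(lapply(split(seq_len(nrow(data)), data[[group]]), function(rows) {
      rows[sample.int(length(rows), size)]
    }))
    out <- data[idx, , drop = FALSE]
    out$.draw <- i  # which repetition this row belongs to
    out
  })
  do.call(rbind, draws)
}

set.seed(1)
# Made-up ratings, for illustration only
toy <- data.frame(
  location = rep(c("Canada", "U.S.A."), each = 5),
  rating = c(3.0, 3.25, 3.5, 2.75, 3.75, 3.0, 2.5, 3.5, 3.25, 4.0)
)
draws <- sample_draws(toy, times = 3, group = "location")
```

Each value of `.draw` then identifies one hypothetical outcome, which is what an animation can step through one state at a time.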

Bootstrapper and sampler objects support repeated, reproducible sampling: pass the same bootstrapper or sampler object as data to multiple ggplot2 layers.

data(BlueJays, package = "Stat2Data")

# set up bootstrapping object that generates 20 bootstraps
# and groups by variable `KnownSex`
bsr <- bootstrapper(20, KnownSex)

ggplot(BlueJays, aes(BillLength, Head, color = KnownSex)) +
  geom_smooth(method = "lm", color = NA) +
  geom_point(alpha = 0.3) +
  # `.row` is a generated column providing a unique row number for all rows
  geom_point(data = bsr, aes(group = .row)) +
  geom_smooth(data = bsr, method = "lm", fullrange = TRUE, se = FALSE) +
  facet_wrap(~KnownSex, scales = "free_x") +
  scale_color_manual(values = c(F = "#D55E00", M = "#0072B2"), guide = "none") +
  theme_bw() +
  transition_states(.draw, 1, 1) + 
  enter_fade() + exit_fade()
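
One way to picture reproducible resampling across layers is a function that resets its random seed on every call. The base-R sketch below illustrates that idea only. `make_bootstrapper()` and its `seed` argument are hypothetical, and ungeviz may achieve reproducibility differently.

```r
# Hypothetical constructor (not part of ungeviz): returns a function that
# resamples rows with replacement `times` times, always from the same seed.
make_bootstrapper <- function(times, seed = 1234) {
  function(data) {
    # Resetting the seed makes every call return the same draws.
    # Note that this also changes the global RNG state.
    set.seed(seed)
    draws <- lapply(seq_len(times), function(i) {
      idx <- sample.int(nrow(data), replace = TRUE)
      out <- data[idx, , drop = FALSE]
      out$.draw <- i
      out
    })
    do.call(rbind, draws)
  }
}

bs <- make_bootstrapper(20)
a <- bs(mtcars)  # e.g. the data for a point layer
b <- bs(mtcars)  # e.g. the data for a smoothing layer
identical(a, b)
```

Because both calls start from the same seed, the two results should match row for row, so points and fit lines drawn from them would describe the same bootstrap samples.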

Smooth draws

Instead of bootstrapping smoothers or regression lines, we can also fit a smoothing model to the data and then generate fit lines by randomly drawing from the posterior distribution. This strategy is automated in stat_smooth_draws(), which works similarly to stat_smooth() but generates multiple equally probable fit draws rather than one best-fit line.

ggplot(mtcars, aes(hp, mpg)) + 
  geom_point() +
  stat_smooth_draws(times = 20, aes(group = stat(.draw))) + 
  theme_bw() +
  transition_states(stat(.draw), 1, 2) +
  enter_fade() + exit_fade()
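
The idea of drawing fit lines from a posterior can be sketched by hand for a simple linear model. The example below is an illustration under stated assumptions, not a description of what stat_smooth_draws() does internally: it swaps in `lm()` for the smoothing model and uses a normal approximation, drawing coefficient vectors from a multivariate normal centered on the estimates with the estimated covariance matrix.

```r
set.seed(42)
fit <- lm(mpg ~ hp, data = mtcars)
beta_hat <- coef(fit)  # point estimates
V <- vcov(fit)         # estimated covariance of the coefficients

# Draw coefficient vectors from N(beta_hat, V) via the Cholesky factor of V
n_draws <- 20
L <- t(chol(V))  # lower-triangular factor, L %*% t(L) equals V
z <- matrix(rnorm(length(beta_hat) * n_draws), nrow = length(beta_hat))
beta_draws <- beta_hat + L %*% z  # one column per draw

# Evaluate each drawn line on a grid of hp values
grid <- data.frame(hp = seq(min(mtcars$hp), max(mtcars$hp), length.out = 50))
X <- model.matrix(~ hp, data = grid)
fit_draws <- X %*% beta_draws  # 50 x 20 matrix of fitted mpg values
```

Each column of `fit_draws` is one candidate fit line. Plotting the columns one at a time gives the same kind of hypothetical outcome plot as the animated example above.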

Miscellaneous geoms and stats

Several geoms and stats are provided that can be helpful when visualizing uncertainty, including geom_hpline() and geom_vpline(), used in the sampling example above, and stat_confidence_density(), which can draw confidence strips.

library(broom)
library(emmeans)

cacao_lumped <- cacao %>%
  mutate(
    location = fct_lump(location, n = 20)
  )
  
cacao_means <- lm(rating ~ location, data = cacao_lumped) %>%
  emmeans("location") %>%
  tidy() %>%
  mutate(location = fct_reorder(location, estimate))

ggplot(cacao_means, aes(x = estimate, moe = std.error, y = location)) +
  stat_confidence_density(fill = "lightblue", height = 0.8, confidence = 0.68) +
  geom_point(aes(x = estimate), size = 2) +
  geom_errorbarh(aes(xmin = estimate - std.error, xmax = estimate + std.error), height = 0.5) +
  xlim(2.6, 3.7) +
  theme_minimal()
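
A confidence strip shades each position along the axis by how plausible it is. The base-R sketch below shows one way such shading could be computed from a margin of error and a confidence level. It assumes, without confirmation from the package source, that `moe` is the half-width of a two-sided normal interval at the stated confidence level. The helper `moe_to_sd()` and the numeric values are hypothetical.

```r
# Assumption: `moe` is the half-width of a two-sided normal interval
# at the given confidence level. Hypothetical helper, not part of ungeviz.
moe_to_sd <- function(moe, confidence = 0.95) {
  moe / qnorm(0.5 + confidence / 2)
}

# Illustrative values, not taken from the cacao data
estimate <- 3.2
std_error <- 0.05

# At 68% confidence the margin of error is roughly one standard deviation
sd_hat <- moe_to_sd(std_error, confidence = 0.68)

# Relative shading intensity along the x axis, scaled so the peak is 1
x <- seq(2.6, 3.7, length.out = 200)
shade <- dnorm(x, mean = estimate, sd = sd_hat)
shade <- shade / max(shade)
```

Under this assumption, passing the standard error as `moe` with `confidence = 0.68` yields a strip whose spread is close to one standard error, consistent with the error bars drawn at `estimate ± std.error` in the plot above.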
