Fast, sensitive and accurate integration of single-cell data with Harmony

Harmony

Check out the manuscript in Nature Methods (Korsunsky et al., 2019).

For Python users, check out the harmonypy package by Kamil Slowikowski.

System requirements

Harmony has been tested on R versions >= 3.4. Please consult the DESCRIPTION file for more details on required R packages. Harmony has been tested on Linux, OS X, and Windows platforms.

Installation

To run Harmony, open R and install harmony from CRAN:

install.packages("harmony")

If you'd like the latest development version, install it directly from GitHub:

library(devtools)
install_github("immunogenomics/harmony")

Usage/Demos

We made it easy to run Harmony in most common R analysis pipelines.

Quick Start

Check out this vignette for a quick start tutorial.

PCA matrix

The Harmony algorithm iteratively corrects PCA embeddings. To input your own low-dimensional embeddings directly, set do_pca=FALSE. Harmony is packaged with a small dataset.

library(harmony)
my_harmony_embeddings <- HarmonyMatrix(my_pca_embeddings, meta_data, "dataset", do_pca=FALSE)
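As a minimal sketch of what the inputs to the call above might look like, the following base R code builds a toy `my_pca_embeddings` matrix and a matching `meta_data` data frame. The simulated data, the artificial shift, and the choice of 10 principal components are illustrative assumptions, not part of Harmony. The layout (cells in rows, one metadata row per cell) is also an assumption about a convenient arrangement.

```r
# Toy inputs for illustration only: 200 simulated cells, 50 features, two datasets.
set.seed(42)
n_cells <- 200
expr <- matrix(
  rnorm(n_cells * 50),
  nrow = n_cells,
  dimnames = list(paste0("cell", seq_len(n_cells)), paste0("gene", 1:50))
)

# One row of metadata per cell; "dataset" is the variable to integrate over.
meta_data <- data.frame(
  dataset = rep(c("A", "B"), each = n_cells / 2),
  row.names = rownames(expr),
  stringsAsFactors = FALSE
)

# Add an artificial shift to dataset "B" to mimic a batch effect.
is_b <- meta_data$dataset == "B"
expr[is_b, ] <- expr[is_b, ] + 1

# Keep the first 10 principal components (cells in rows, PCs in columns).
pca <- prcomp(expr, center = TRUE, scale. = TRUE, rank. = 10)
my_pca_embeddings <- pca$x

# The embeddings and metadata must describe the same cells in the same order.
stopifnot(identical(rownames(my_pca_embeddings), rownames(meta_data)))
```

With objects shaped like these, the `HarmonyMatrix(..., do_pca=FALSE)` call shown above can be run as written.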

Normalized gene matrix

You can also run Harmony on a sparse matrix of library size normalized expression counts. Harmony will scale these counts, run PCA, and finally perform integration.

library(harmony)
my_harmony_embeddings <- HarmonyMatrix(normalized_counts, meta_data, "dataset")
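For readers who start from raw counts, here is one hedged sketch of how a library-size-normalized sparse matrix could be prepared with the Matrix package. The toy counts, the scale factor of 10,000, and the `log1p` transform are common conventions assumed here for illustration. They are not requirements stated by Harmony.

```r
library(Matrix)

# Toy sparse count matrix for illustration: 100 genes (rows) x 30 cells (columns).
set.seed(1)
counts <- rsparsematrix(
  nrow = 100, ncol = 30, density = 0.2,
  rand.x = function(n) rpois(n, 5) + 1
)

# Library size = total counts per cell.
lib_size <- Matrix::colSums(counts)

# Scale each cell to 10,000 total counts (assumed convention); guard empty cells.
scale_factor <- ifelse(lib_size > 0, 1e4 / lib_size, 0)
scaled_counts <- counts %*% Diagonal(x = scale_factor)

# Log-transform only the stored non-zero entries, which keeps the matrix sparse.
normalized_counts <- scaled_counts
normalized_counts@x <- log1p(scaled_counts@x)

# One metadata row per cell (column of the matrix).
meta_data <- data.frame(dataset = rep(c("A", "B"), each = 15))
```

Right-multiplying by a diagonal matrix rescales each column without converting the counts to a dense matrix.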

Seurat

You can run Harmony within your Seurat workflow. You'll only need to make two changes to your code.

  1. Run Harmony with the RunHarmony() function
  2. In downstream analyses, use the Harmony embeddings instead of PCA.

For example, run Harmony and then UMAP in two lines.

seuratObj <- RunHarmony(seuratObj, "dataset")
seuratObj <- RunUMAP(seuratObj, reduction = "harmony")

For details, check out these vignettes:

MUDAN

You can run Harmony with functions from the MUDAN package. For more details, check out this vignette.

Harmony with two or more covariates

Harmony can integrate over multiple covariates. To do this, specify a vector of covariates to integrate.

my_harmony_embeddings <- HarmonyMatrix(
  my_pca_embeddings, meta_data, c("dataset", "donor", "batch_id"),
  do_pca = FALSE
)

Do the same with your Seurat object:

seuratObj <- RunHarmony(seuratObj, c("dataset", "donor", "batch_id"))
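Before integrating over several covariates, it can help to confirm that each one is actually present and usable in the metadata. The helper below, `check_covariates`, is hypothetical and not part of the harmony package. Its checks (column exists, no missing values, at least two levels) are assumptions about what makes a covariate sensible to integrate over.

```r
# Hypothetical helper (not part of harmony): verify that every covariate is a
# column of meta_data, has no missing values, and has at least two levels.
check_covariates <- function(meta_data, vars_use) {
  missing_cols <- setdiff(vars_use, colnames(meta_data))
  if (length(missing_cols) > 0) {
    stop("Covariates not found in meta_data: ",
         paste(missing_cols, collapse = ", "))
  }
  for (v in vars_use) {
    if (anyNA(meta_data[[v]])) {
      stop("Covariate '", v, "' contains missing values")
    }
    if (length(unique(meta_data[[v]])) < 2) {
      stop("Covariate '", v, "' has a single level")
    }
  }
  invisible(TRUE)
}

# Toy metadata for illustration: 12 cells with three covariates.
meta_data <- data.frame(
  dataset  = rep(c("A", "B"), each = 6),
  donor    = rep(c("d1", "d2", "d3"), times = 4),
  batch_id = rep(c("b1", "b2"), times = 6),
  stringsAsFactors = FALSE
)
vars_use <- c("dataset", "donor", "batch_id")

check_covariates(meta_data, vars_use)

# Cross-tabulate two covariates to see how they overlap across cells.
table(meta_data$dataset, meta_data$donor)
```

The same `vars_use` vector can then be passed as the covariate argument in the calls shown above.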

Advanced

The examples above all return integrated PCA embeddings. We created a more advanced tutorial that explores the internal data structures used in the Harmony algorithm.

Reproducing results from manuscript

Code to reproduce Harmony results from the Korsunsky et al. (2019) manuscript will be made available on github.com/immunogenomics/harmony2019.
