Customizable workflows based on Snakemake and Python for the analysis of NGS data

snakePipes

snakePipes are flexible and powerful workflows built using Snakemake that simplify the analysis of NGS data.

Workflows available

  • DNA-mapping*
  • ChIP-seq*
  • mRNA-seq*
  • noncoding-RNA-seq*
  • ATAC-seq*
  • scRNA-seq
  • Hi-C
  • Whole Genome Bisulfite Seq/WGBS

(*Also available in "allele-specific" mode)

Installation

snakePipes is a set of Snakemake workflows that use conda for installation and dependency resolution, so you will need to install conda first.

Afterward, simply run the following:

conda install mamba -c conda-forge && mamba create -n snakePipes -c mpi-ie -c bioconda -c conda-forge snakePipes

This will create a new conda environment called "snakePipes" into which snakePipes is installed. You will then need to create the conda environments needed by the various workflows. To do this, run the following commands:

  • conda activate snakePipes to activate the appropriate conda environment.
  • snakePipes createEnvs to create the various environments.
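
The installation steps above can be collected into one small script. This is a minimal sketch, not part of snakePipes: it assumes conda is already installed and on your PATH, and the `run` helper and `DRY_RUN` switch are hypothetical names introduced here so that the commands are only printed by default. It also uses `conda run -n snakePipes` in place of `conda activate snakePipes`, because `conda activate` requires a shell that conda has initialised, which a plain script usually is not.

```shell
#!/bin/sh
# Sketch of the installation steps as one script.
# Assumption: conda is already installed and available on PATH.
set -eu

# Hypothetical switch: with DRY_RUN=1 (the default) commands are only printed.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print a command, and execute it unless in dry-run mode.
run() {
    echo "+ $*"
    if [ "$DRY_RUN" = "0" ]; then
        "$@"
    fi
}

# 1. Install mamba and create the "snakePipes" environment.
run conda install mamba -c conda-forge
run mamba create -n snakePipes -c mpi-ie -c bioconda -c conda-forge snakePipes

# 2. Create the per-workflow environments from inside the new environment.
run conda run -n snakePipes snakePipes createEnvs
```

Running the script as is prints the three commands for review; running it with `DRY_RUN=0` set in the environment executes them in order and stops at the first failure.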

The indices and annotations needed to run the workflows can be created with a single command:

createIndices --genomeURL <path/url to your genome fasta> --gtfURL <path/url to genes.gtf> -o <output_dir> <name>

where <name> is the name/ID of your genome (choose whatever you like).
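
As an illustration of how the placeholders are filled in, the sketch below assembles the createIndices call from variables and prints it. All paths and the genome name are hypothetical example values, and the `DRY_RUN` switch is an assumption of this sketch; only the flags shown above (`--genomeURL`, `--gtfURL`, `-o`) are used.

```shell
#!/bin/sh
# Hypothetical example values; replace them with your own paths or URLs.
GENOME_FASTA="/path/to/genome.fa"   # assumption: path or URL to the genome FASTA
GENES_GTF="/path/to/genes.gtf"      # assumption: path or URL to the gene annotation
OUTPUT_DIR="/path/to/indices"       # assumption: directory for the generated indices
GENOME_NAME="my_genome"             # any name/ID you choose for this genome

# Store the command in the positional parameters so each argument stays intact.
set -- createIndices --genomeURL "$GENOME_FASTA" --gtfURL "$GENES_GTF" -o "$OUTPUT_DIR" "$GENOME_NAME"

# Print the command for review; set DRY_RUN=0 to actually run it.
echo "$*"
if [ "${DRY_RUN:-1}" = "0" ]; then
    "$@"
fi
```

Printing the command first makes it easy to check the paths before starting what may be a long indexing job.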

A few additional steps you can then take:

1. Modify, remove, or add organism YAML files as needed: these files contain the locations of the GTF files and genome indices for each organism. Their location after installation can be found with the snakePipes info command.

2. Modify the cluster.yaml file as needed: this file contains the settings for your cluster scheduler (SGE/Slurm). Its location can also be found with the snakePipes info command.

Documentation

For detailed documentation on setup and usage, please visit our Read the Docs page.

Citation

If you use snakePipes in your analysis, please cite it as follows:

Bhardwaj, Vivek, Steffen Heyne, Katarzyna Sikora, Leily Rabbani, Michael Rauer, Fabian Kilpert, Andreas S. Richter, Devon P. Ryan, and Thomas Manke. 2019. “snakePipes: Facilitating Flexible, Scalable and Integrative Epigenomic Analysis.” Bioinformatics, May. doi:10.1093/bioinformatics/btz436

Note

snakePipes is under active development, and we appreciate your help in improving it further. Please open an issue on the GitHub repository for feature requests or bug reports.
