• Stars: 124
• Rank: 288,207 (Top 6%)
• Language: C++
• License: Other
• Created over 10 years ago
• Updated 5 months ago

Repository Details

Parsnp was designed to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to a few hours. Input can be draft assemblies, finished genomes, or a mix of both, and output includes variant (SNP) calls, a core genome phylogeny, and multi-alignments. Parsnp leverages contextual information provided by the multi-alignments surrounding SNP sites for filtration/cleaning, in addition to existing tools for recombination detection/filtration and phylogenetic reconstruction.

Parsnp is a command-line tool for efficient microbial core genome alignment and SNP detection. Parsnp was designed to work in tandem with Gingr, a flexible platform for visualizing genome alignments and phylogenetic trees; both Parsnp and Gingr form part of the Harvest suite.

Installation

From conda

Parsnp is available on the Bioconda channel. This is the recommended method of installation. Once you have added the Bioconda channel to your conda environment, Parsnp can be installed via:

conda install parsnp
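
For users who prefer to keep Parsnp isolated from other software, the following is a minimal sketch that installs it into a dedicated conda environment. The environment name parsnp-env and the function name install_parsnp_env are illustrative assumptions, not part of the Parsnp documentation; the -c options simply name the conda-forge and Bioconda channels explicitly for this one command.

```shell
# Sketch only: install Parsnp into its own conda environment.
# "parsnp-env" is a hypothetical environment name; pick any name you like.
install_parsnp_env() {
    env_name=${1:-parsnp-env}
    if ! command -v conda >/dev/null 2>&1; then
        echo "conda not found on PATH; install conda first" >&2
        return 1
    fi
    # Name the channels explicitly so no global channel configuration is needed.
    conda create -y -n "$env_name" -c conda-forge -c bioconda parsnp
}

# Usage (not run automatically):
#   install_parsnp_env            # creates "parsnp-env"
#   conda activate parsnp-env
```

Defining the steps as a function keeps the snippet safe to paste into a shell: nothing is installed until the function is called.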

From source

To build Parsnp from source, users must have automake 1.15, autoconf, and libtool installed. Parsnp also requires RAxML, PhiPack, Harvest-tools, and numpy. Some additional features require Mash, FastANI, and FastTree. All of these packages are available via Conda (many on the Bioconda channel).
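
Before starting a source build, it can help to confirm that the build tools named above are on the PATH. The sketch below is one way to do that. The function name check_tools is hypothetical, and the command names automake, autoconf, and libtool are assumed to match the executables those packages install (on some systems libtool is installed under a different name, such as glibtool).

```shell
# Report which of the given commands are available on PATH.
# Prints one "found"/"missing" line per tool and returns non-zero if any is missing.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "found:   $tool"
        else
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}

# Build tools listed in this section; extend the list with other dependencies as needed.
check_tools automake autoconf libtool || echo "Install the missing tools before building."
```

The check only looks for executables, so it says nothing about versions (for example automake 1.15) or about Python packages such as numpy.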

Build instructions

First, you must build the Muscle library:

cd muscle
./autogen.sh
./configure --prefix=$PWD CXXFLAGS='-fopenmp'
make install

Now we can build Parsnp:

cd ..
./autogen.sh
./configure
make LDADD=-lMUSCLE-3.7 
make install

If you wish to be able to move your Parsnp installation around after building, build the parsnp binary as follows (after building the Muscle library):

./autogen.sh
export ORIGIN=\$ORIGIN
./configure LDFLAGS='-Wl,-rpath,$$ORIGIN/../muscle/lib'
make LDADD=-lMUSCLE-3.7 
make install

Note that the parsnp executable in bin/ is not the same as the one at the root level. The former is an alias for Parsnp.py, while the latter is the core algorithm of Parsnp that we build above.

OSX Users (Catalina)

Recent versions of macOS include Gatekeeper, which is designed to ensure that only software from known developers runs on your Mac. Please refer to this link to enable the binaries shipped with Parsnp to run: https://support.apple.com/en-us/HT202491

Running Parsnp

Parsnp can be run in multiple ways, but the most common is with a set of genomes and a reference, given either as a GenBank file (-g) or as a FASTA file (-r).

parsnp -g <reference_genbank> -d <genomes> 
parsnp -r <reference_fasta> -d <genomes> 

For example,

./parsnp -g examples/mers_virus/ref/England1.gbk -d examples/mers_virus/genomes/*.fna -c
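
The two invocation forms differ only in the reference flag: -g for a GenBank reference and -r for a FASTA reference. As a small illustration, the sketch below picks the flag from the reference file's extension and prints the resulting command rather than running it. The function name parsnp_cmd and the list of GenBank extensions are assumptions made for this example, not Parsnp behavior.

```shell
# Print (do not run) a parsnp command line for a reference plus genomes.
# Assumption: files ending in .gbk or .gb are GenBank; anything else is treated as FASTA.
parsnp_cmd() {
    ref=$1
    shift
    case "$ref" in
        *.gbk|*.gb) flag=-g ;;  # GenBank reference
        *)          flag=-r ;;  # FASTA reference
    esac
    echo "parsnp $flag $ref -d $*"
}

# Dry run using the example paths from this section.
parsnp_cmd examples/mers_virus/ref/England1.gbk examples/mers_virus/genomes/*.fna
```

Printing the command first makes it easy to inspect the expanded genome list before committing to a long alignment run.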

More examples can be found in the readthedocs tutorial.

Misc

CITATION provides details on how to cite Parsnp.

LICENSE provides licensing information.
