  • Stars: 600
  • Rank 74,640 (Top 2 %)
  • Language: C++
  • Created over 9 years ago
  • Updated about 1 year ago

Repository Details

A single molecule sequence assembler for genomes large and small.

Canu

Canu is a fork of the Celera Assembler, designed for high-noise single-molecule sequencing (such as the PacBio RS II/Sequel or Oxford Nanopore MinION).

Canu is a hierarchical assembly pipeline which runs in four steps:

  • Detect overlaps in high-noise sequences using MHAP
  • Generate corrected sequence consensus
  • Trim corrected sequences
  • Assemble trimmed corrected sequences

Install:

  • Do NOT download the .zip source code. It is missing files and will not compile. This is a known flaw with git itself.

  • The easiest way to get started is to download a binary release.

  • Installing with a 'package manager' is not encouraged, but if you have no other choice:

    • Conda: conda install -c conda-forge -c bioconda -c defaults canu
    • Homebrew: brew install brewsci/bio/canu
  • Alternatively, you can build the latest unreleased version from the source code. This version has not undergone the same testing as a release, so it may contain unknown bugs or produce sub-optimal assemblies. We recommend the release version for most users.

      git clone https://github.com/marbl/canu.git
      cd canu/src
      make -j <number of threads>
    
  • An unsupported Docker image made by Frank Förster is at https://hub.docker.com/r/greatfireball/canu/.

Learn:

The quick start will get you assembling quickly, while the tutorial explains things in more detail.

Run:

Brief command line help:

../<architecture>/bin/canu

Full list of parameters:

../<architecture>/bin/canu -options
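
The `<architecture>` placeholder in the paths above stands for the directory the source build writes its binaries to. The sketch below guesses that name from `uname` output. It assumes the directory is named `<OS>-<machine>`, with `x86_64` reported as `amd64` (for example `Linux-amd64`). That naming and the helper `canu_arch_dir` are illustrative assumptions, not something this README states, so check your own build tree if the path does not exist.

```shell
# Hypothetical helper: guess the build output directory name.
# Assumption: the directory is "<OS>-<machine>", with x86_64 renamed to amd64.
canu_arch_dir() {
  os="$1"        # operating system name, e.g. the output of `uname -s`
  machine="$2"   # hardware name, e.g. the output of `uname -m`
  case "$machine" in
    x86_64) machine="amd64" ;;
  esac
  printf '%s-%s\n' "$os" "$machine"
}

# Print the path where the canu binary would be expected, relative to canu/src.
arch_dir="$(canu_arch_dir "$(uname -s)" "$(uname -m)")"
echo "Expected binary: ../${arch_dir}/bin/canu"
```

If the printed path exists, running it with no arguments gives the brief help described above.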

Citation:

More Repositories

  1. CHM13 (914 stars): The complete sequence of a human genome
  2. Krona (JavaScript, 419 stars): Interactively explore metagenomes and more from a web browser.
  3. Mash (C++, 355 stars): Fast genome and metagenome distance estimation using MinHash
  4. verkko (Python, 274 stars): Telomere-to-telomere assembly of accurate long reads (PacBio HiFi, Oxford Nanopore Duplex, HERRO corrected Oxford Nanopore Simplex) and Oxford Nanopore ultra-long reads.
  5. MashMap (C++, 210 stars): A fast approximate aligner for long DNA sequences
  6. merqury (Shell, 201 stars): k-mer based assembly evaluation
  7. Winnowmap (C, 187 stars): Long read / genome alignment software
  8. SALSA (Python, 178 stars): SALSA: A tool to scaffold long read assemblies with Hi-C data
  9. parsnp (C++, 124 stars): Parsnp was designed to align the core genome of hundreds to thousands of bacterial genomes within a few minutes to few hours. Input can be both draft assemblies and finished genomes, and output includes variant (SNP) calls, core genome phylogeny and multi-alignments. Parsnp leverages contextual information provided by multi-alignments surrounding SNP sites for filtration/cleaning, in addition to existing tools for recombination detection/filtration and phylogenetic reconstruction.
  10. ModDotPlot (Python, 102 stars)
  11. MHAP (Java, 95 stars): MinHash Alignment Process (MHAP, pronounced MAP): locality-sensitive hashing to detect long-read overlaps and utilities
  12. HG002 (94 stars): A complete diploid human genome
  13. metAMOS (Roff, 93 stars): A metagenomic and isolate assembly and analysis pipeline built with AMOS
  14. meryl (C, 78 stars): A genomic k-mer counter (and sequence utility) with nice features.
  15. harvest (50 stars)
  16. MetaCompass (Python, 38 stars): MetaCompass: Reference-guided Assembly of Metagenomes
  17. Primates (38 stars): Complete assemblies of non-human primate genomes
  18. MetagenomeScope (JavaScript, 25 stars): Visualization tool for (meta)genome assembly graphs
  19. seqrequester (C, 23 stars): A tool for summarizing, extracting, generating and modifying DNA sequences.
  20. rukki (Rust, 22 stars): Extracting paths from assembly graphs
  21. CHM13-issues (HTML, 18 stars): CHM13 human reference genome issue tracking
  22. T2T-Browser (HTML, 15 stars): Genome browser hub for the T2T genomes and resources
  23. VALET (TeX, 14 stars): A pipeline for detecting mis-assemblies in metagenomic assemblies.
  24. gingr (C++, 13 stars)
  25. MetaCarvel (C++, 13 stars): MetaCarvel: A scaffolder for metagenomes
  26. MUMmer3 (C++, 11 stars): MUMmer3
  27. binnacle (Python, 10 stars): Binnacle: Using Scaffolds to Improve the Contiguity and Quality of Metagenomic Bins
  28. HG002-issues (10 stars): HG002 human reference genome issue tracking and polishing
  29. harvest-tools (C++, 8 stars)
  30. ATLAS (Python, 3 stars): outlier detection in BLAST hits