  • Language
    C++
  • License
    MIT License

Raven

Published in Nature Computational Science.

Raven is a de novo genome assembler for long uncorrected reads.

Usage

To build the raven executable, run the following commands:

git clone https://github.com/lbcb-sci/raven && cd raven
cmake -S ./ -B./build -DRAVEN_BUILD_EXE=1 -DCMAKE_BUILD_TYPE=Release
cmake --build build

For faster builds, optionally use Ninja as the generator and build in parallel, e.g.:

git clone https://github.com/lbcb-sci/raven && cd raven
cmake -S ./ -B ./build -DRAVEN_BUILD_EXE=1 -DCMAKE_BUILD_TYPE=Release -G Ninja
cmake --build build -j 4

To install the raven executable after building, run:

cmake --install ./build

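To install the Python bindings, run the following (the https transport is used here because GitHub no longer serves the unauthenticated git:// protocol):

pip install git+https://github.com/lbcb-sci/raven.git@master

A Python example can be found at PythonLib/example.py.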

usage: raven [options ...] <sequences> [<sequences> ...]

  # default output is to stdout in FASTA format
  <sequences>
    input file in FASTA/FASTQ format (can be compressed with gzip)

  options:
    -k, --kmer-len <int>
      default: 15
      length of minimizers used to find overlaps
    -w, --window-len <int>
      default: 5
      length of sliding window from which minimizers are sampled
    -f, --frequency <double>
      default: 0.001
      threshold for ignoring most frequent minimizers
    -i, --identity <double>
      default: 0
      threshold for overlap between two reads in order to construct an edge between them
      if set to zero, this functionality is disabled
    -o, --kMaxNumOverlaps <long unsigned int>
      default: 32
      maximum number of overlaps that will be taken during FindOverlapsAndCreatePiles stage
    -p, --polishing-rounds <int>
      default: 2
      number of times racon is invoked
    -m, --match <int>
      default: 3
      score for matching bases
    -n, --mismatch <int>
      default: -5
      score for mismatching bases
    -g, --gap <int>
      default: -4
      gap penalty (must be negative)
    -u, --min-unitig-size <int>
      default: 9999
      minimal unitig size
    --graphical-fragment-assembly <string>
      prints the assembly graph in GFA format
    --resume
      resume previous run from last checkpoint
    --disable-checkpoints
      disable checkpoint file creation
    -t, --threads <int>
      default: 1
      number of threads
    --version
      prints the version number
    -h, --help
      prints the usage

  only available when built with CUDA:
    -c, --cuda-poa-batches <int>
      default: 0
      number of batches for CUDA accelerated polishing
    -b, --cuda-banded-alignment
      use banding approximation for polishing on GPU
      (only applicable when -c is used)
    -a, --cuda-alignment-batches <int>
      default: 0
      number of batches for CUDA accelerated alignment
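
As an illustration of the options above, the following is a minimal sketch of driving the raven executable from a Python script. It is not part of the raven project: the helper names build_raven_command and run_raven, and the file names in the example, are hypothetical. It assumes raven is on the PATH and uses only flags listed in the usage text.

```python
import subprocess
from pathlib import Path


def build_raven_command(reads, threads=1, polishing_rounds=2, gfa_path=None,
                        executable="raven"):
    """Return the argv list for one raven run (hypothetical helper).

    Only flags documented in the usage text are used; the defaults mirror
    the documented defaults for --threads and --polishing-rounds.
    """
    command = [executable,
               "--threads", str(threads),
               "--polishing-rounds", str(polishing_rounds)]
    if gfa_path is not None:
        # Also write the assembly graph in GFA format.
        command += ["--graphical-fragment-assembly", str(gfa_path)]
    # One or more FASTA/FASTQ inputs (optionally gzip-compressed) come last.
    command += [str(path) for path in reads]
    return command


def run_raven(reads, output_fasta, **options):
    """Run raven and redirect its FASTA output (stdout) to output_fasta."""
    command = build_raven_command(reads, **options)
    with Path(output_fasta).open("w") as handle:
        # check=True raises CalledProcessError if raven exits with an error.
        subprocess.run(command, stdout=handle, check=True)


if __name__ == "__main__":
    # Dry run: show the command without requiring raven to be installed.
    print(" ".join(build_raven_command(["reads.fastq.gz"], threads=8,
                                       gfa_path="assembly.gfa")))
```

Building the argument list separately from running it keeps the command easy to inspect or log before a long assembly starts. Calling run_raven(["reads.fastq.gz"], "assembly.fasta", threads=8) would then write the assembled sequences to assembly.fasta.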

To use the raven library component in your project, add the following to your CMake file:

include(FetchContent)

FetchContent_Declare(
        raven
        GIT_REPOSITORY https://github.com/lbcb-sci/raven
        GIT_TAG v1.8.1)

FetchContent_GetProperties(raven)
if (NOT raven_POPULATED)
    FetchContent_Populate(raven)
    add_subdirectory(
            ${raven_SOURCE_DIR}
            ${raven_BINARY_DIR}
            EXCLUDE_FROM_ALL)
endif ()

target_link_libraries(<YourTarget> <PRIVATE|PUBLIC|INTERFACE> raven)

Build options

  • RAVEN_BUILD_TESTS: build unit tests
  • RAVEN_BUILD_PYTHON: build the Python module
  • RAVEN_BUILD_SHARED_LIBS: build the raven library and its dependencies as shared libraries
  • RAVEN_BUILD_EXE: build the raven executable
  • racon_enable_cuda: build with NVIDIA CUDA support
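
These options are passed to CMake as -D definitions at configure time, in the same way as -DRAVEN_BUILD_EXE=1 in the build commands earlier. The sketch below shows one way a build script might assemble such a configure command. The function name cmake_configure_command is hypothetical, and the 1/0 values simply follow the style used in this README.

```python
def cmake_configure_command(options, source_dir="./", build_dir="./build",
                            build_type="Release"):
    """Return a cmake configure argv with one -D definition per option.

    `options` maps a build option name (e.g. "RAVEN_BUILD_TESTS") to a bool.
    Hypothetical helper; it only composes the command and does not run it.
    """
    command = ["cmake", "-S", source_dir, "-B", build_dir,
               f"-DCMAKE_BUILD_TYPE={build_type}"]
    for name, enabled in options.items():
        # Emit 1 for enabled options and 0 for disabled ones.
        command.append(f"-D{name}={1 if enabled else 0}")
    return command


if __name__ == "__main__":
    # Example: build the executable together with the unit tests.
    print(" ".join(cmake_configure_command({"RAVEN_BUILD_EXE": True,
                                            "RAVEN_BUILD_TESTS": True})))
```

The resulting list can be handed to subprocess.run, followed by cmake --build build as shown in the build instructions.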

Dependencies

  • gcc 7.5+ | clang 8.0+
  • cmake 3.11+
  • zlib 1.2.8+
  • pybind11
  • lbcb-sci/racon/tree/library 3.0.2
  • rvaser/bioparser 3.0.13
  • (raven_test) google/googletest 1.10.0

Other options

NOTE: not updated for 1.8 release

Brew

Install Linuxbrew and run the following command:

brew install brewsci/bio/raven-assembler

Conda

Install conda and run the following command:

conda install -c bioconda raven-assembler

Acknowledgment

This work has been supported in part by the Genome Institute of Singapore (A*STAR), by the Croatian Science Foundation under projects Algorithms for genome sequence analysis (UIP-11-2013-7353) and Single genome and metagenome assembly (IP-2018-01-5886), and in part by the European Regional Development Fund under grant KK.01.1.1.01.0009 (DATACROSS).