


A robust, extensible metagenomics pipeline

Sunbeam: a robust, extensible metagenomic sequencing pipeline

DOI: 10.1186/s40168-019-0658-x

Sunbeam is a pipeline written in Snakemake that simplifies and automates many of the steps in metagenomic sequencing analysis. It uses conda to manage dependencies, so it requires neither pre-installed dependencies nor admin privileges, and it can be deployed on most Linux workstations and clusters. To read more, check out our paper in Microbiome.

Sunbeam currently automates the following tasks:

  • Quality control, including adaptor trimming, host read removal, and quality filtering;
  • Taxonomic assignment of reads to databases using Kraken;
  • Assembly of reads into contigs using Megahit;
  • Contig annotation using BLAST[n/p/x];
  • Mapping of reads to target genomes; and
  • ORF prediction using Prodigal.

Sunbeam was designed to be modular and extensible. Some extensions have been built for:

  • IGV for viewing read alignments
  • KrakenHLL, an alternate read classifier
  • Kaiju, a read classifier that uses the Burrows-Wheeler transform (BWT) rather than k-mers
  • Anvi'o, a downstream analysis and visualization platform for 'omics data

More extensions can be found at the extension page: https://github.com/sunbeam-labs.

To get started, see our documentation!

If you use the Sunbeam pipeline in your research, please cite:

EL Clarke, LJ Taylor, C Zhao, A Connell, J Lee, FD Bushman, K Bittinger. Sunbeam: an extensible pipeline for analyzing metagenomic sequencing experiments. Microbiome 7:46 (2019)



Changelog:

v3.1.1 (October 19, 2022)

  • Upgrade manage-versions.sh to work with any version (before or after 3.1) and always keep itself on the latest version
  • Allow multiple targets with sunbeam run --target_list [TARGET, ...]
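
As a rough illustration of the multi-target option, the following Python sketch assembles a `sunbeam run` command line with several targets. The helper name `build_run_command` is hypothetical, the target names are placeholders, and passing targets as space-separated values after `--target_list` is an assumption based on the `[TARGET, ...]` notation above; consult the Sunbeam documentation for the exact syntax.

```python
import shlex


def build_run_command(targets):
    """Return an argv list for `sunbeam run` with one or more targets.

    Assumption: targets follow --target_list as space-separated values.
    """
    if not targets:
        raise ValueError("at least one target is required")
    return ["sunbeam", "run", "--target_list", *targets]


# Placeholder target names, for illustration only.
cmd = build_run_command(["target_a", "target_b"])
print(shlex.join(cmd))
```

The argv list can be handed to `subprocess.run` once the syntax has been confirmed for your installed version.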

v3.1.0 (October 14, 2022)

  • Upgrade snakemake to v7.15.1, which fixes two issues with running jobs on the cluster
  • Add manage-versions.sh which manages versions of sunbeam automatically for users in a single repo
  • Fix build errors for previous versions of docs introduced when trying to add versioning info from sunbeamlib
  • Improve the install script's decision making on when to use mamba vs conda and add informative errors
  • Remove 'str' calls from snakemake that are no longer necessary (snakemake <5.7 requirement)
  • Add 'Sunbeam Commands', 'Software Structure', 'install.sh', and 'manage-versions.sh' pages to docs

v3.0.1 (August 11, 2022)

  • Add biom-format to Kraken env
  • Fix sunbeam init not picking up extension configs
  • Fix multiqc naming issue that seemed to arise across different multiqc versions

v3.0.0 (June 27, 2022)

  • Support use of .smk file extensions in Sunbeam extensions (in addition to .rules)
  • Make use of snakemake's built-in features for environment management to separate dependencies and shrink environments
  • Support mamba as an alternate package dependency solver at install time, for faster installs
  • New command sunbeam extend to automatically install Sunbeam extensions! Use like sunbeam extend https://github.com/sunbeam-labs/sbx_report
  • sunbeam init and sunbeam config update now add options for extensions you've installed to your default config file! (#247)
  • Updated the path to the Illumina adapter sequences from hardcoded to templated (fixes #150 and #152)
  • Use the updated kraken2 classifier instead of kraken
  • Update other dependencies (trimmomatic -> 0.39; grabseqs -> 0.6.1; snakemake -> <5.7.0)
  • Use diamond instead of blastx/p for a significant speed increase
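
To make the `.smk`/`.rules` change concrete, here is a minimal Python sketch of how rule files with either suffix could be collected from an extension directory. It is only an illustration: the function name `find_rule_files`, the directory layout, and the file names are hypothetical, and this is not a description of how Sunbeam itself discovers extension rules.

```python
import tempfile
from pathlib import Path

# Suffixes accepted for extension rule files, per the changelog entry above.
RULE_SUFFIXES = {".rules", ".smk"}


def find_rule_files(extension_dir):
    """Return the sorted rule files (.rules or .smk) directly inside extension_dir."""
    root = Path(extension_dir)
    return sorted(
        p for p in root.iterdir() if p.is_file() and p.suffix in RULE_SUFFIXES
    )


# Demonstration with a throwaway directory and made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("sbx_demo.smk", "sbx_legacy.rules", "README.md"):
        (Path(tmp) / name).touch()
    found_names = [p.name for p in find_rule_files(tmp)]
    print(found_names)
```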

v2.1.0 (November 26, 2019)

  • Added a build manifest, which is run every time on integration testing and can be fed into conda by users to install the most recent successful dependencies
  • Updates to documentation (#169, #230, #231)
  • Fix missing samtools (#224)
  • Integration test updates to schedule weekly builds (#222)
  • Fix issues with old paired-end illumina adapters (#221)
  • Script updates to use conda commands instead of source commands (#220)
  • Add h5py package explicitly to avoid dependency metadata problem (#219)
  • Add multiQC to build QC report (#203)
  • Use multithreading for cutadapt in QC (#202)
  • Correct conda channel priority during install (#201)
  • Update documentation to spell out requirements (#199)
  • New megahit failure handling (#194)
  • Enforce sample wildcard constraints in Snakemake rules (#190)
  • Run megahit multithreaded (#189)

v2.0.2 (August 28, 2019)

  • Add implicit dependencies (samtools and bcftools) to environment file to make them explicit

v2.0.1 (July 24, 2019)

  • Increment Snakemake version requirement for compatibility with recent conda
  • Specify earlier megahit version to ensure compatibility with existing assembly behavior
  • Integration test improvements

v2.0.0 (January 22, 2019)

  • Start a project using resources directly from the SRA using sunbeam init --data_acc [SRA ###]. For more information, see the docs
  • New extension website: https://www.sunbeam-labs.org/
  • Improved documentation
  • Numerous bugfixes and optimizations

v1.2.1 (May 24, 2018)

  • Minor bugfixes

v1.2.0 (May 2, 2018)

  • Low-complexity reads are now removed by default rather than masked
  • Bug fixes related to single-end sequencing experiments
  • Documentation updates
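
The difference between removing and masking low-complexity reads can be sketched in Python as follows. This is a simplified illustration, not komplexity's implementation: the score (unique k-mers divided by read length), the k-mer size of 4, the threshold of 0.55, and the function names are all assumptions chosen for the example.

```python
def kmer_complexity(seq, k=4):
    """Unique k-mers divided by sequence length (0.0 if the read is shorter than k)."""
    if len(seq) < k:
        return 0.0
    kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    return len(kmers) / len(seq)


def filter_reads(reads, threshold=0.55, mask=False):
    """Drop low-complexity reads, or replace their bases with N when mask=True."""
    kept = []
    for name, seq in reads:
        if kmer_complexity(seq) >= threshold:
            kept.append((name, seq))
        elif mask:
            # Masking keeps the read but hides its sequence.
            kept.append((name, "N" * len(seq)))
    return kept


# Two made-up reads: one varied, one homopolymer.
reads = [("r1", "ACGTTGCAGGCTAACGTACC"), ("r2", "AAAAAAAAAAAAAAAAAAAA")]
removed = filter_reads(reads)             # low-complexity read dropped
masked = filter_reads(reads, mask=True)   # low-complexity read kept as Ns
print(len(removed), len(masked))
```

With removal as the default, downstream steps never see the low-complexity read; with masking, read counts are preserved but the masked read carries no usable sequence.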

v1.1.0 (April 8, 2018)

  • Reports include number of filtered reads per host, rather than in aggregate
  • Static binary dependency for komplexity for easier deployment
  • Remove max length filter for contigs

v1.0.0 (March 22, 2018)

  • First stable release!
  • Support for single-end sequencing experiments
  • Low-complexity read masking via komplexity
  • Support for extensions
  • Documentation on ReadTheDocs.io
  • Better assembler (megahit)
  • Better ORF finder (prodigal)
  • Can remove reads from any number of host/contaminant genomes
  • Semantic versioning checks
  • Integration tests and continuous deployment
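
As an illustration of what a semantic versioning check can look like, the Python sketch below parses `MAJOR.MINOR.PATCH` tags such as those in this changelog and compares them. What Sunbeam's own checks compare is not stated here, so the compatibility rule (same major version, installed version not older than required) and the function names are assumptions.

```python
import re

# Accepts tags such as "v3.1.1" or "3.1.1".
_VERSION_RE = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag):
    """Parse a MAJOR.MINOR.PATCH tag into a tuple of ints."""
    match = _VERSION_RE.match(tag.strip())
    if match is None:
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {tag!r}")
    return tuple(int(part) for part in match.groups())


def is_compatible(installed, required):
    """True if both share a major version and installed is not older than required."""
    have, need = parse_version(installed), parse_version(required)
    return have[0] == need[0] and have >= need


print(is_compatible("v3.1.1", "v3.0.0"))
print(is_compatible("v3.0.0", "v2.1.0"))
```

Comparing tuples of integers avoids the string-comparison pitfall where "10" sorts before "9".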
