  • Stars: 292
  • Rank: 141,297 (Top 3 %)
  • Language: Python
  • License: MIT License
  • Created over 4 years ago
  • Updated 24 days ago

Repository Details

MetaPhlAn is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data

MetaPhlAn: Metagenomic Phylogenetic Analysis

An updated markers database is now available!

  • Addition of ~200k new genomes
  • 3,580 more SGBs than the vJan21
  • 2,548 genomes considered reference genomes in vJan21 were relabelled as MAGs in NCBI -> 1,550 kSGBs in vJan21 are now uSGBs in vOct22
  • Removed redundant reference genomes from the vJan21 genomic database using a MASH distance threshold of 0.1%
  • Local reclustering to improve SGB definitions of oversized or too-close SGBs
  • Improved GGB and FGB definitions by reclustering SGB centroids from scratch
  • Improved phylum assignment of SGBs with no reference genomes at FGB level using MASH distances on amino acids to find the closest kSGB

What's new in version 4

  • Adoption of the species-level genome bins system (SGBs, http://segatalab.cibio.unitn.it/data/Pasolli_et_al.html)
  • New MetaPhlAn marker genes identified from ~1M microbial genomes
  • Ability to profile 21,978 known (kSGBs) and 4,992 unknown (uSGBs) microbial species
  • Better representation of not only the human gut microbiome but also many other animal and ecological environments
  • Estimation of the fraction of the metagenome composed of microbes not included in the database with parameter --unclassified_estimation
  • Compatibility with MetaPhlAn 3 databases with parameter --mpa3
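
As a rough illustration of how the two parameters above might be combined with a basic run, the following Python sketch assembles a MetaPhlAn command line. The helper name `build_metaphlan_command` and the file names are hypothetical, and the `--input_type` and `-o` options are assumed from typical MetaPhlAn usage rather than taken from this page; check `metaphlan --help` for the options available in your installed version.

```python
import shlex


def build_metaphlan_command(input_path, output_path, input_type="fastq",
                            unclassified_estimation=False, mpa3=False):
    """Assemble a MetaPhlAn command line as a list of arguments.

    Hypothetical helper for illustration; it only builds the command
    and does not run it.
    """
    # Basic invocation (assumed): input file, input type, output profile.
    cmd = ["metaphlan", input_path, "--input_type", input_type, "-o", output_path]
    if unclassified_estimation:
        # Estimate the fraction of the metagenome not covered by the database.
        cmd.append("--unclassified_estimation")
    if mpa3:
        # Compatibility mode for MetaPhlAn 3 databases.
        cmd.append("--mpa3")
    return cmd


# Placeholder file names; replace with your own data.
cmd = build_metaphlan_command("metagenome.fastq", "profile.txt",
                              unclassified_estimation=True)
print(shlex.join(cmd))
# To actually run it: subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.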

Full list of changes here.


Description

MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With StrainPhlAn, it is possible to perform accurate strain-level microbial profiling. MetaPhlAn 4 relies on ~5.1M unique clade-specific marker genes identified from ~1M microbial genomes (~236,600 reference genomes and 771,500 metagenome-assembled genomes) spanning 26,970 species-level genome bins (SGBs, http://segatalab.cibio.unitn.it/data/Pasolli_et_al.html), 4,992 of which are taxonomically unidentified at the species level (the latest marker information file can be found here), allowing:

  • unambiguous taxonomic assignments;
  • an accurate estimation of organismal relative abundance;
  • SGB-level resolution for bacteria, archaea and eukaryotes;
  • strain identification and tracking;
  • orders of magnitude speedups compared to existing methods;
  • metagenomic strain-level population genomics.
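
The counts quoted on this page can be cross-checked with a few lines of arithmetic. The short Python sketch below uses only figures stated above (21,978 kSGBs, 4,992 uSGBs, 26,970 SGBs, ~236,600 reference genomes and 771,500 MAGs); the variable names are illustrative.

```python
# Figures as quoted on this page.
known_sgbs = 21_978      # kSGBs
unknown_sgbs = 4_992     # uSGBs
stated_total_sgbs = 26_970

reference_genomes = 236_600  # approximate
mags = 771_500               # metagenome-assembled genomes

# kSGBs plus uSGBs should equal the stated number of SGBs.
total_sgbs = known_sgbs + unknown_sgbs
assert total_sgbs == stated_total_sgbs

# Share of SGBs that have no species-level taxonomic label.
unknown_fraction = unknown_sgbs / total_sgbs

# References plus MAGs should be close to the "~1M genomes" figure.
total_genomes = reference_genomes + mags

print(f"{total_sgbs:,} SGBs, {unknown_fraction:.1%} uSGBs, {total_genomes:,} genomes")
```

So roughly one in five profiled SGBs is a uSGB, and the two genome counts sum to just over one million, in line with the "~1M" figure.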

If you use MetaPhlAn, please cite:

Extending and improving metagenomic taxonomic profiling with uncharacterized species using MetaPhlAn 4. Aitor Blanco-Miguez, Francesco Beghini, Fabio Cumbo, Lauren J. McIver, Kelsey N. Thompson, Moreno Zolfo, Paolo Manghi, Leonard Dubois, Kun D. Huang, Andrew Maltez Thomas, Gianmarco Piccinno, Elisa Piperni, Michal Punčochář, Mireia Valles-Colomer, Adrian Tett, Francesca Giordano, Richard Davies, Jonathan Wolf, Sarah E. Berry, Tim D. Spector, Eric A. Franzosa, Edoardo Pasolli, Francesco Asnicar, Curtis Huttenhower, Nicola Segata. Nature Biotechnology (2023)

If you use StrainPhlAn, please cite the MetaPhlAn paper and the following StrainPhlAn paper:

Microbial strain-level population structure and genetic diversity from metagenomes. Duy Tin Truong, Adrian Tett, Edoardo Pasolli, Curtis Huttenhower, & Nicola Segata. Genome Research 27:626-638 (2017)


Installation

The best way to install MetaPhlAn is through conda via the Bioconda channel. If you have not configured your Anaconda installation to fetch packages from Bioconda, please follow these steps to set up the channels.

You can install MetaPhlAn by running

$ conda install -c bioconda metaphlan

To install it from source, and for further installation instructions, please see the Installation section of the Wiki.
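
After installing, one quick sanity check is to confirm that the `metaphlan` executable is visible on your `PATH`. The following is a minimal sketch using only the Python standard library; the helper name `find_metaphlan` is hypothetical, and the check says nothing about whether the marker database has been downloaded.

```python
import shutil


def find_metaphlan(executable="metaphlan"):
    """Return the full path of the executable if it is on PATH, else None."""
    return shutil.which(executable)


path = find_metaphlan()
if path is None:
    print("metaphlan not found on PATH; is the conda environment activated?")
else:
    print(f"metaphlan found at {path}")
```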


MetaPhlAn and StrainPhlAn tutorials and resources

In addition to the information on this page, you can refer to the following additional resources.

More Repositories

  • biobakery (Shell, 253 stars): bioBakery tools for meta'omic profiling
  • humann (Python, 152 stars): HUMAnN is the next generation of HUMAnN 1.0 (HMP Unified Metabolic Analysis Network).
  • phylophlan (Python, 118 stars): Precise phylogenetic analysis of microbial isolates and genomes from metagenomes
  • Maaslin2 (R, 111 stars): MaAsLin2: Microbiome Multivariate Association with Linear Models
  • kneaddata (Python, 102 stars): Quality control tool on metagenomic and metatranscriptomic sequencing data, especially data from microbiome experiments.
  • biobakery_workflows (Python, 94 stars): bioBakery workflows is a collection of workflows and tasks for executing common microbial community analyses using standardized, validated tools and parameters.
  • graphlan (Python, 84 stars): High-quality circular representations of taxonomic and phylogenetic trees
  • melonnpan (R, 33 stars): Model-based Genomically Informed High-dimensional Predictor of Microbial Community Metabolic Profiles
  • MetaPhlAn2 (Python, 29 stars)
  • shortbred (Python, 28 stars): ShortBRED is a pipeline to take a set of protein sequences, reduce them to a set of unique identifying strings ("markers"), and then search for these markers in metagenomic data and determine the presence and abundance of the protein families of interest.
  • metawibele (Python, 20 stars): MetaWIBELE: Workflow to Identify novel Bioactive Elements in microbiome
  • SparseDOSSA2 (HTML, 11 stars)
  • halla (Python, 11 stars)
  • homebrew-biobakery (Ruby, 10 stars): Biobakery formulae for the Homebrew package manager
  • anadama2 (Python, 10 stars): AnADAMA2 is the next generation of AnADAMA (Another Automated Data Analysis Management Application). AnADAMA is a tool to capture your workflow and execute it efficiently on your local machine or in a grid compute environment (i.e. Sun Grid Engine or Slurm).
  • MTX_model (R, 8 stars): R package for differential expression analysis in metatranscriptomics
  • banocc (TeX, 6 stars)
  • MMUPHin (R, 6 stars): MMUPHin: Meta-analysis Methods with Uniform Pipeline for Heterogeneity in Microbiome Studies
  • maaslin (R, 6 stars): MaAsLin is a multivariate statistical framework that finds associations between clinical metadata and potentially high-dimensional experimental data.
  • anpan (R, 5 stars)
  • sparseDOSSA (R, 5 stars)
  • ccrepe (R, 4 stars): The CCREPE (Compositionality Corrected by REnormalization and PErmutation) package is designed to assess the significance of general similarity measures in compositional datasets.
  • ibd_paper (R, 3 stars)
  • hmp2_analysis (R, 3 stars)
  • meddiet2020 (R, 3 stars)
  • micropita (Python, 3 stars)
  • cgfp (Python, 3 stars)
  • uniref_annotator (Python, 3 stars)
  • waafle (Python, 3 stars): http://huttenhower.sph.harvard.edu/waafle
  • Macarron (R, 3 stars)
  • omnibus-and-maaslin2-rscripts-and-hmp2-data (R, 2 stars): Omnibus and Maaslin2 Rscripts and hmp2-data
  • humann_legacy (Python, 2 stars)
  • maaslin2_benchmark (R, 2 stars): Large-scale Benchmarking of Microbial Multivariable Association Methods
  • breadcrumbs (Python, 2 stars)
  • galaxy_lefse (Python, 2 stars)
  • crc-subtyping-paper (R, 2 stars)
  • parathaa (R, 2 stars): Preserving and Assimilating Region-specific Ambiguities in Taxonomic Hierarchical Assignments for Amplicons (Parathaa)
  • fugassem (Python, 2 stars): FUGAsseM: Function predictor of Uncharacterized Gene products by Assessing high-dimensional community data in Microbiomes
  • halla_legacy (Python, 1 star): Hierarchical All-against-All association testing is designed as a command-line tool to find associations in high-dimensional, heterogeneous datasets.
  • MTX_synthetic (Python, 1 star): Synthetic data generation for the mtx2021 project
  • jdrf2 (HTML, 1 star)
  • pouchitis (Python, 1 star)
  • synmetap (Python, 1 star)
  • galaxy-upgrade (Python, 1 star)
  • ibd_meta_analysis (R, 1 star)
  • sparsedossa_paper (R, 1 star)
  • conda-biobakery (Shell, 1 star): A collection of conda build recipes
  • qiimetomaaslin (Python, 1 star)
  • ppanini (UnrealScript, 1 star): Prioritization and Prediction of functional Annotations for Novel and Important genes via automated data Network Integration
  • Physical-activity-gut-microbiome-body-weight (R, 1 star)
  • maaslin3 (R, 1 star): MaAsLin3: Microbiome Multivariate Association with Linear Models