  • Stars: 1,351
  • Rank: 34,763 (Top 0.7%)
  • Language: Python
  • License: MIT License
  • Created almost 5 years ago
  • Updated 2 months ago

Repository Details

Nextstrain build for novel coronavirus SARS-CoV-2

About

This repository analyzes viral genomes using Nextstrain to understand how SARS-CoV-2, the virus that is responsible for the COVID-19 pandemic, evolves and spreads.

We maintain a number of publicly available builds, visible at nextstrain.org/ncov.

See our change log for details about backwards-incompatible or breaking changes to the workflow.

Visit the workflow documentation for tutorials and reference material.

Download formatted datasets

The hCoV-19 / SARS-CoV-2 genomes were generously shared via GISAID. We gratefully acknowledge the Authors, Originating and Submitting laboratories of the genetic sequence and metadata made available through GISAID on which this research is based.

In order to download the GISAID data to run the analysis yourself, please see this guide.

Please note that data/metadata.tsv is no longer included as part of this repo. However, we provide continually updated, pre-formatted metadata and FASTA files for download through GISAID.

Read previous Situation Reports

We issued weekly Situation Reports for the first ~5 months of the pandemic. You can find the Reports and their translations here.

FAQs

  • Can't find your sequences in Nextstrain? Check here for common reasons why your sequences may not be appearing. You can also use clades.nextstrain.org to perform some basic quality control on your sequences. If they are flagged by this tool, they will likely be excluded by our pipeline.
  • For information about how clades are defined, and the currently named clades, please see here. To assign clades to your own sequences, you can use our clade assignment tool at clades.nextstrain.org.

Bioinformatics notes

Site numbering and genome structure use Wuhan-Hu-1/2019 as the reference. The phylogeny is rooted relative to early samples from Wuhan. Temporal resolution assumes a nucleotide substitution rate of 8 × 10^-4 substitutions per site per year. SNPs in the first and last few bases of the alignment were masked as likely sequencing artifacts.
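
As a rough illustration of what these notes imply, the sketch below converts the per-site rate into expected substitutions per genome and shows one simple way to mask the ends of an aligned sequence. It is a minimal sketch, not the workflow's code: the genome length of 29,903 nt is the length of the Wuhan-Hu-1 reference, while the function names and the mask widths (`head`, `tail`) are hypothetical choices made for this example.

```python
# Illustrative sketch only; not the workflow's actual code.
SUBSTITUTION_RATE = 8e-4   # substitutions per site per year (from the notes above)
GENOME_LENGTH = 29_903     # length of the Wuhan-Hu-1 reference in nucleotides


def expected_substitutions(years: float) -> float:
    """Expected substitutions across the whole genome after `years`."""
    return SUBSTITUTION_RATE * GENOME_LENGTH * years


def mask_ends(sequence: str, head: int, tail: int, mask_char: str = "N") -> str:
    """Replace the first `head` and last `tail` characters with `mask_char`.

    `head` and `tail` are hypothetical parameters; choose them per analysis.
    """
    if head < 0 or tail < 0:
        raise ValueError("head and tail must be non-negative")
    if head + tail >= len(sequence):
        # The masked regions cover the whole sequence.
        return mask_char * len(sequence)
    middle = sequence[head:len(sequence) - tail]
    return mask_char * head + middle + mask_char * tail


if __name__ == "__main__":
    print(f"{expected_substitutions(1.0):.1f} substitutions per genome per year")
    print(mask_ends("ACGTACGTAC", head=2, tail=3))
```

Under these assumptions a lineage accumulates roughly 24 substitutions per year, or about two per month, which gives a sense of the temporal resolution the rate provides.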

Contributing

We welcome contributions from the community! Please note that we strictly adhere to the Contributor Covenant Code of Conduct.

Contributing to software or documentation

Please see our Contributor Guide to get started!

Contributing data

Please note that we automatically pick up any SARS-CoV-2 data that is submitted to GISAID.

If you're a lab and you'd like to get started sequencing, please see:


Get in touch

To report a bug, error, or feature request, please open an issue.

For questions, head over to the discussion board; we're happy to help!

More Repositories

  1. auspice (JavaScript, 288 stars): Web app for visualizing pathogen evolution
  2. augur (Python, 214 stars): Pipeline components for real-time phylodynamic analysis
  3. nextclade (Rust, 211 stars): Viral genome alignment, mutation calling, clade assignment, quality checks and phylogenetic placement
  4. nextstrain.org (JavaScript, 88 stars): The Nextstrain website
  5. seasonal-flu (Python, 44 stars): Scripts, config, and snakefiles for seasonal-flu nextstrain builds
  6. mpox (Python, 42 stars): Nextstrain build for mpox virus
  7. ncov-ingest (Python, 35 stars): A pipeline that ingests SARS-CoV-2 (i.e. nCoV) data from GISAID and Genbank, transforms it, stores it on S3, and triggers Nextstrain nCoV rebuilds.
  8. fauna (Python, 33 stars): RethinkDB database to support real-time virus analysis
  9. nextclade_data (Python, 29 stars): Datasets for https://github.com/nextstrain/nextclade
  10. cli (Python, 25 stars): The Nextstrain command-line interface (CLI), a program called nextstrain, which aims to provide a consistent way to run and visualize pathogen builds and access Nextstrain components like Augur and Auspice across computing environments such as Docker, Conda, and AWS Batch.
  11. avian-flu (Python, 19 stars): Nextstrain build for avian influenza viruses
  12. auspice.us (JavaScript, 13 stars)
  13. docs.nextstrain.org (Python, 13 stars): Umbrella documentation project for Nextstrain
  14. phyloTree (JavaScript, 11 stars): Interactive phylogenetic tree viewer
  15. docker-base (Shell, 10 stars): Docker image build for nextstrain/base
  16. zika-tutorial (Python, 10 stars): Data and scripts for Zika virus tutorial
  17. cov (Python, 8 stars): Coronavirus builds
  18. zika (Python, 8 stars): Nextstrain build for Zika virus
  19. .github (Shell, 8 stars)
  20. dengue (Python, 8 stars): Nextstrain build for dengue virus
  21. ncov-clades-schema (JavaScript, 8 stars): Renders SVG schema of SARS-CoV-2 clade as defined by Nextstrain
  22. forecasts-ncov (Python, 7 stars): SARS-CoV-2 variant growth rates and frequency forecasts
  23. rsv (Python, 6 stars): Workflow for RSV analyses on Nextstrain.org
  24. janus (Python, 6 stars): Build and deploy Nextstrain
  25. zika-tutorial-nextflow (Nextflow, 6 stars): Nextflow based pipeline for Zika tutorial
  26. flu_frequencies (TypeScript, 3 stars): Flu clade and mutation frequencies
  27. translations (3 stars): Repo for translations of nextstrain website and COVID-19 weekly situation reports.
  28. ebola (Python, 3 stars): Nextstrain build for Ebola virus
  29. react-sidebar (JavaScript, 2 stars)
  30. mumps (Python, 2 stars): Nextstrain build for mumps virus
  31. mers (Python, 2 stars): Nextstrain build for MERS-CoV
  32. tb (Python, 2 stars): Nextstrain build for tuberculosis
  33. mers-beast-tutorial (Python, 2 stars): Tutorial to visualize MERS-CoV tree from BEAST in auspice
  34. flora (Python, 2 stars): DB management, APIs, web portals etc.
  35. readthedocs-cli (Python, 2 stars): Sync RTD project redirects from a YAML file
  36. WNV (Python, 2 stars): The repository used to build West Nile Virus for nextstrain
  37. conda (Shell, 2 stars): Nextstrain environment for the Conda package manager
  38. narratives (Shell, 2 stars): Narrative markdown files accessed via nextstrain.org/narratives/x
  39. nextclade_dataset_template (Python, 1 star): Template repository to generate nextclade datasets
  40. conda-base (Shell, 1 star): Conda package build for nextstrain-base
  41. whitepaper (TeX, 1 star): White paper describing Nextstrain
  42. augurlinos (Python, 1 star): A collection of modules for molecular epidemiology
  43. simplest-build (Python, 1 star)
  44. forecasts-viz (JavaScript, 1 star)
  45. ncov-simple (Python, 1 star): Simplified ncov (SARS-CoV-2) workflow.
  46. wdl-debug (WDL, 1 star): WDL debugging workflows
  47. sacra (Python, 1 star): Cleaning scripts for real-time pathogen analysis
  48. gisaid-dupes (Rust, 1 star)
  49. enterovirus_d68 (Python, 1 star): Enterovirus-D68 phylogenetic analyses code - actively maintained
  50. seasonal-cov (Python, 1 star): Nextstrain build for seasonal coronaviruses