BC Cancer Canada's Michael Smith Genome Sciences Centre (@bcgsc)

Top repositories

1

abyss

🔬 Assemble large genomes using short reads
C++
310
star
2

NanoSim

Nanopore sequence read simulator
Python
237
star
3

ntHash

Fast hash function for DNA/RNA sequences
C++
96
star
4

RNA-Bloom

🌺 reference-free transcriptome assembly for short and long reads
Java
94
star
5

arcs

🌈 Scaffold genome sequence assemblies using linked or long read sequencing data
C++
91
star
6

ntJoin

🔗 Genome assembly scaffolder using minimizer graphs
Python
82
star
7

ntCard

Estimating k-mer coverage histogram of genomics data
C++
76
star
8

biobloom

Create Bloom filters for a given reference and then use it to categorize sequences
C++
75
star
9

LINKS

⛓ Long Interval Nucleotide K-mer Scaffolder
Perl
73
star
10

mirna

microRNA profiling pipeline
Perl
73
star
11

mavis

Merging, Annotation, Validation, and Illustration of Structural variants
Python
73
star
12

ntSynt

Detecting multi-genome synteny using minimizer graph mapping
Python
69
star
13

straglr

Tandem repeat expansion detection or genotyping from long-read alignments
Python
66
star
14

ntEdit

✏️ Genome assembly polishing & SNV detection
C++
64
star
15

tigmint

⛓ Correct misassemblies using linked AND long reads
Python
55
star
16

HLAminer

⛏ HLA predictions from NGS shotgun data
Perl
51
star
17

orca

🐳 Genomics Research Container Architecture
R
48
star
18

LongStitch

Correct and scaffold assemblies using long reads
Makefile
45
star
19

AMPlify

Attentive deep learning model for antimicrobial peptide prediction
Python
39
star
20

xmatchview

🗻 Visualization of genome/gene sequence synteny
Python
36
star
21

goldrush

Linear-time de novo Long Read Assembler
C++
35
star
22

transabyss

de novo assembly of RNA-seq data using ABySS
Python
34
star
23

ntLink

Minimizer-based assembly scaffolding and mapping using long reads
Python
31
star
24

pori

Platform for Oncogenomic Reporting and Interpretation (PORI)
Shell
30
star
25

RAILS

🚝 RAILS and 👞🔨 Cobbler: Assembly Improvement by Long Sequence Scaffolding/Gap-filling
Perl
27
star
26

SSAKE

🍶 Genome assembly with short sequence reads
Perl
24
star
27

btllib

Bioinformatics Technology Lab common code library
C++
23
star
28

kollector

de novo targeted gene assembly
Shell
22
star
29

btl_bloomfilter

The BTL C/C++ common Bloom filters for bioinformatics projects, as well as any APIs created for other programming languages.
C++
18
star
30

physlr

⛓️ Construct a Physical Map from Linked Reads
Python
18
star
31

pavfinder

🔍 Post Assembly Variants Finder
Python
17
star
32

ntEmbd

Deep learning embedding for nucleotide sequences
Python
16
star
33

chromeqc

ChromeQC: Summarize sequencing library quality of 10x Genomics Chromium linked reads
HTML
15
star
34

ntHits

Identifying repeats in high-throughput sequencing data
C++
15
star
35

ChopStitch

Finding putative exons and constructing splicegraphs using Trans-ABySS contigs
C++
12
star
36

unikseq

🧬 Unique (& conserved) DNA sequence identification
Perl
12
star
37

mtGrasp

mtGrasp: de novo reference-grade mitochondrial genome assembly and standardization
Python
12
star
38

RNA-Scoop

:shipit: interactive visualization of single-cell transcriptomes
Java
11
star
39

abyss-2.0-giab

🍼 Assemble the Genome in a Bottle sequencing data
Makefile
10
star
40

arks

ARCHIVED 🌈 Alignment-free scaffolding of genome assemblies with 10x Genomics Chromium reads. ARCS/ARKS projects have been consolidated: https://github.com/bcgsc/arcs
C++
10
star
41

pori_graphkb_client

Front-end web client for the GraphKB project
TypeScript
9
star
42

pori_ipr_api

Integrated Pipeline Reports (IPR) API, the reporting API as part of the PORI platform
JavaScript
8
star
43

pori_ipr_client

Integrated Pipeline Reports (IPR) client. The web interface for the reporting application as part of PORI
TypeScript
8
star
44

pori_graphkb_python

Python adapter package for querying the GraphKB API
Python
7
star
45

tAMPer

tAMPer: antimicrobial peptides toxicity prediction
Jupyter Notebook
7
star
46

TASR

Targeted Assembly of Short Reads
Perl
7
star
47

PASS

Proteome Assembler with Short peptide Sequence
Perl
7
star
48

rAMPage

rAMPage: Rapid AMP Annotation and Gene Estimation
Shell
6
star
49

ABySS-explorer

Visualize genome sequence assemblies
Java
6
star
50

pori_graphkb_loader

The Loaders for GraphKB. Imports content from external sources via the GraphKB REST API
JavaScript
6
star
51

pori_ipr_python

Python adapter for generating reports uploaded to IPR using the PORI platform
Python
6
star
52

ntedit_sealer_protocol

Efficient targeted error resolution and automated finishing of long-read genome assemblies
Makefile
5
star
53

Trans-NanoSim

Oxford nanopore transcriptome read simulator
Python
5
star
54

Terminitor

Deep Neural Network model that predicts polyadenylation sites
Python
5
star
55

GapPredict

Character-level language model for draft genome assembly gap-filling
Python
5
star
56

goldpolish

GoldPolish (aka GoldRush-Edit) is a long read sequence polisher used in the GoldRush assembler
C++
5
star
57

pori_graphkb_schema

Shared package between the API and GUI for GraphKB which holds the schema definitions and schema-related functions
TypeScript
4
star
58

tasrkleat-TCGA-analysis-scripts

This repo stores code for the analysis of tasrkleat results on TCGA RNA-Seq data
Jupyter Notebook
4
star
59

rnaseq_utils

utility scripts for RNA-seq data
Python
4
star
60

pori_graphkb_parser

A package for parsing and recreating HGVS-like variant notation used in GraphKB
TypeScript
4
star
61

pori_graphkb_api

REST API for the GraphKB project
JavaScript
4
star
62

peekseq

De novo protein-coding potential calculator using a k-mer approach
Perl
4
star
63

bloom-identity-est

These scripts provide a fast, memory-efficient method for estimating the percent sequence identity between two genomes using a probabilistic data structure called a Bloom filter
R
4
star
64

ntRoot

🌳 Human ancestry inference from genomic data
Python
4
star
65

TMBur

Derive TMB estimates from whole genome fastq files. Runs alignment and variant calling before reporting a variety of TMB estimates
Nextflow
3
star
66

Canadian_Biogenome_Project

This repo contains the pipeline used by the Canadian Biogenome Project (http://earthbiogenome.ca) to generate assemblies
Nextflow
3
star
67

gum

GUM: Group, User Manager for LDAP
Python
2
star
68

dida

DIDA Project
C++
2
star
69

sqlalchemy_hawq

Custom dialect for using SQLAlchemy with a HAWQ database which extends the postgres dialect
Python
2
star
70

graphviz-utils

GraphViz scripts
Shell
2
star
71

Mito-AssemblyViz

Mitochondrial Genome Assembly Assessment Visualization
HTML
2
star
72

qupath-annotation-exchange

An extension for QuPath for importing JSON annotations made in a GSC internal application
Java
2
star
73

pori_cbioportal

cBioPortal adaptor for creating PORI reports in IPR
Python
2
star
74

tr_catalog

Tandem repeat catalog from public long-read sequence assemblies
Python
1
star
75

link_str

Analysis scripts developed for genotyping STRs in linked-read data
Python
1
star
76

ggcli

command line interface for ggplot
Python
1
star
77

abyss-organelle

Utilities to assemble an organelle genome using ABySS
R
1
star
78

long_read_pog

Analysis code for a cohort of Nanopore-sequenced tumours.
HTML
1
star
79

IMPALA

Integrated Mapping and Profiling of Allelically-expressed Loci with Annotations
R
1
star
80

lcm

lcm
Java
1
star
81

abyss-connector-paper

ABySS-Connector: Connecting Paired Sequences with a Bloom Filter de Bruijn Graph
1
star
82

picea-glauca-plastid

🌲 Annotate the plastid genome of white spruce (Picea glauca)
Shell
1
star
83

rsempipeline

A pipeline for running rsem analysis on thousands of samples
Python
1
star
84

Stash

This repository contains the implementation for the Stash data structure
C++
1
star
85

picea-engelmannii-plastid

🌲 Annotate the plastid genome of Engelmann spruce (Picea engelmannii)
Makefile
1
star
86

AMPd-Up

De novo antimicrobial peptide sequence generation with recurrent neural networks
Python
1
star
87

btl

🔬 Bioinformatics Technology Lab, Genome Sciences Centre
HTML
1
star
88

amnatate

Draft genome completeness assessment using hash based approach
C++
1
star