Big Data Genomics (@bigdatagenomics)
  • Stars: 1,362
  • Global Org. Rank 12,219 (Top 4 %)
  • Registered over 10 years ago
  • Most used languages
    Scala: 55.6 %
    Python: 11.1 %
    Java: 11.1 %
    Shell: 11.1 %
    HTML: 5.6 %
    Dockerfile: 5.6 %

Top repositories

1

adam

ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.
Scala
967 stars
2

mango

A scalable genome browser. Apache 2 licensed.
Scala
121 stars
3

avocado

A Variant Caller, Distributed. Apache 2 licensed.
Scala
71 stars
4

bdg-formats

Open source formats for scalable genomic processing systems using Avro. Apache 2 licensed.
Shell
38 stars
5

cannoli

Distributed execution of bioinformatics tools on Apache Spark. Apache 2 licensed.
Scala
38 stars
6

eggo

Ready-to-go Parquet-formatted public 'omics datasets
Python
30 stars
7

rice

An RNA pipeline built on top of ADAM. Apache 2 licensed.
Scala
19 stars
8

utils

General utility code used across BDG products. Apache 2 licensed.
Scala
18 stars
9

awesome-adam

Awesome list of applications that extend Big Data Genomics ADAM. CC0 licensed.
11 stars
10

bigdatagenomics.github.io

Web Site for the Big Data Genomics Group
HTML
10 stars
11

gnocchi

Scala
6 stars
12

workflows

Toil workflows for bigdatagenomics tools. Apache 2 licensed.
Python
5 stars
13

bdg-services

Utility classes for wrapping services or other interfaces around a Spark/ADAM cluster. Apache 2 licensed.
Java
5 stars
14

lime

Distributed Set Theory for Genomics
Scala
5 stars
15

recipes

Recipes using BDG projects. Apache 2 licensed.
Shell
4 stars
16

PacMin

Assembler for PacBio reads. Apache 2 licensed.
Scala
3 stars
17

quinine

A refreshing treatment for all quality control ailments. Apache 2 licensed.
Scala
2 stars
18

jenkins-docker

Dockerfile
1 star
19

toil-wdl-api

Exemplar API that mediates Toil with a WDL front-end and workflow tracking.
Java
1 star