Google Cloud Platform (GCP) for Bioinformatics
This repository shows how to use Google Cloud Platform (GCP) public cloud services to scale bioinformatics data analysis tasks. It follows cloud best practices for GCP. All examples use genomic sample (input) data, tools and pipelines. The use cases shown here go by any of the following names:
- genomic-scale data workflows or pipelines
- bioinformatics primary, secondary or tertiary analysis
- distributed cloud-based batch jobs
This content is intended for researchers, in particular those who are NEW to working with GCP. You have a number of options for how to use the materials provided here. A summary is shown below.
This Repo includes content you can read, watch or run:
- READ - one page of this Repo (MD page)
- WATCH - linked YouTube screencasts
- RUN - Jupyter Notebook example
- TRY - linked GitHub Repos
- EXPAND - linked (external) resources
- SCAN - search a list in this Repo
Click below to WATCH 'Lynn's Welcome Video' (4 min) on YouTube
Why would I choose to use a public cloud vendor for bioinformatics?
SAVE TIME - use vendor-managed infrastructure and best-practice patterns for fast, repeatable research
READ - Nature article: "Cloud computing for genomic data analysis and collaboration"
Want more advanced GCP content for bioinformatics?
If you would like to learn more advanced concepts for working with Google Cloud Platform, including script examples and patterns, see my Repo gcp-essentials.
New to Bioinformatics?
If you are NEW to bioinformatics and have a computational background...
- REVIEW my bioinformatics concepts, tools and terms
Contributions
We love contributions! See this short style guide when making pull requests to this repo.