Learning Bioinformatics At Home
Some resources gathered by the Harvard Informatics group and other contributors to help people learn bioinformatics tools (basic and specialized) at home.
Table of contents
- Unix
- R
- Python
- Statistics
- Git and version control
- RNA-seq
- Single-cell Analysis
- Read mapping
- Variant calling
- Miscellaneous
Unix
Introduction
- Greg Wilson's YouTube videos on the Unix shell
- Introduction to the Command Line for Genomics
- A short introduction to grep
- Lockdown Learning - Bioinformatics: Daily livestreamed bioinformatics lessons, assuming no prior experience.
Intermediate
Advanced
- Use the Unofficial Bash Strict Mode (Unless You Looove Debugging)
- Process Substitution
- One-Liners for Bioinformatics
R
Introduction
- Data Analysis for the Life Sciences Series: A course by Rafael Irizarry at Dana-Farber
- Introduction to Computational Biology: Tutorial by Mike Love, the author of DESeq2 and other R packages
Intermediate
- Introduction to Bioconductor: The structure, annotation, normalization, and interpretation of genome scale assays (free edX course)
- R for Data Science: By Hadley Wickham, developer of the tidyverse and many other R packages
Advanced
- Advanced Bioconductor: Learn advanced approaches to genomic visualization, reproducible analysis, data architecture, and exploration of cloud-scale consortium-generated genomic data (free edX course)
- Advanced R: Advanced course by Hadley Wickham
- Bioconductor courses and conferences: Overview of Bioconductor training resources
Python
Introduction
- Using Python for Research: A collection of links to YouTube videos; scroll to the bottom.
- Biology Meets Programming: Bioinformatics for Beginners
Intermediate
- Intermediate Python
- Checkio: Python coding game, great for practice!
Advanced
Statistics
- Modern Statistics for Modern Biology: Book by Susan Holmes and Wolfgang Huber
Git and version control
- Happy Git and GitHub for the useR: A book by Jenny Bryan
- Paper: A Quick Introduction to Version Control with Git and GitHub
- Paper: Ten Simple Rules for Taking Advantage of Git and GitHub
- Git in Practice: An opinionated intermediate/advanced Git book
- Fixing Problems: Git is hard: screwing up is easy, and figuring out how to fix your mistakes can feel impossible. The Git documentation has a chicken-and-egg problem: you can't search for how to get yourself out of a mess unless you already know the name of the thing you need in order to fix it. Here are resources to help figure out what to do when things go wrong.
- oh shit git!
- How to undo (almost) anything with Git
- A guide for astronauts (now, programmers using Git) about what to do when things go wrong: [git flight rules](https://github.com/k88hudson/git-flight-rules)
RNA-seq
Single-cell Analysis
Read mapping
Variant calling
- Harvard Informatics fastq to VCF: Aimed at non-model organisms.
- RAD-seq tutorials from dDocent on Reference Assembly and SNP filtering
Miscellaneous
- Understanding snakemake: An overview from Vince Buffalo, author of Bioinformatics Data Skills
- A Primer for Computational Biology: A nice book.
- Fundamentals of Data Visualization: Claus Wilke's book on data visualization, covering principles and figure design.
- gcp-for-bioinformatics: A repo with patterns for using the public cloud for bioinformatics. It uses GCP, but the patterns can be applied to other public cloud vendors, e.g. AWS or Azure.
- SLiM workshops: An extensive tutorial for using SLiM, a population genetic simulation environment