  • Stars
    177
  • Rank 214,754 (Top 5 %)
  • Language
    HTML
  • License
    MIT License
  • Created about 6 years ago
  • Updated about 1 month ago

Repository Details

Rob's manual for the computational genomics and bioinformatics class.

ComputationalGenomicsManual

About this manual

Rob, Liz Dinsdale, Tom Jeffries, Bruno Gomez-Gil, Jim Mitchell, and several other colleagues and friends have been teaching genomics and metagenomics for a long time. They have written this manual over the course of several years, and in a variety of formats. Rob moved it to markdown using GitHub in Fall 2018 as part of his computational genomics class.

You can view this manual online at https://linsalrob.github.io/ComputationalGenomicsManual/.

Companion Videos

Videos that accompany this class are available on Rob's YouTube playlist.

Chapter Index

Chapter Contents HTML PDF
1. Linux HTML PDF
2. Conda HTML PDF
3. Snakemake HTML PDF
4. Sequencing Overview HTML PDF
5. Sequence File Formats HTML PDF
6. Sequence Quality Control HTML PDF
7. Databases HTML PDF
7a. - NCBI Edirect HTML PDF
7b. - NCBI SRA HTML PDF
8. Genome Sequencing Overview HTML PDF
9. Sequence Assembly HTML PDF
10. ORF Calling HTML PDF
11. tRNA and rRNA identification HTML PDF
12. Annotation Pipelines HTML PDF
13. Metagenomics HTML PDF
14. - Example Data Sets HTML PDF
15. Cross Assembly HTML PDF
15a. - Metabat HTML PDF
15b. - CCOM HTML PDF
16. 16S sequencing HTML PDF
17. FOCUS HTML PDF
18. Kraken HTML PDF
19. SUPER-FOCUS HTML PDF
20. GenomePeek HTML PDF
21. RTMg HTML PDF
22. OrfM and the SEED HTML PDF
23. ANVI'O HTML PDF
24. CheckM HTML PDF

Workshops

We are using this content in a variety of workshops.

Assignments

Solutions are not shown yet, but you can work through some of these assignments.

Datasets

We have several different datasets available for you to use to try the course work out. There are both 16S and random metagenomes, and links to genomics data.

PDFs

Note: The PDFs are automatically created from the markdown, and lose some of the images and links. You should probably use the HTML version most of the time.

About Copyright Information

Some of the images used in this manual are currently under copyright held by other people. As noted above, Rob and friends wrote this manual over many years and added the images and cartoons to lighten it. We are in the process of identifying the copyright holders and/or identifying images that are not copyrighted. If your rights have been infringed upon, if you would like to provide an indemnification, or if you would like to provide a non-copyrighted image, please contact Rob.

Copyright

This manual is Copyright Robert A. Edwards, 2018.

Citation

If you wish to cite this manual, please cite: Edwards, R. 2018. Computational Genomics. https://linsalrob.github.io/ComputationalGenomicsManual/. Accessed [today's date]. DOI: 10.5281/zenodo.7883375

References

We have an extensive list of references available, but if you find something missing that we should have cited: (a) we're sorry, we tried to remember all of them, and (b) please email Rob or submit a pull request and we'll add it.

More Repositories

1

fastq-pair

Match up paired end fastq files quickly and efficiently.
C
139
star
2

PhiSpy

Prediction of prophages from bacterial genomes
Jupyter Notebook
70
star
3

EdwardsLab

Code from the Edwards lab, including bioinformatics, image analysis and more. All this code is created and maintained by folks at Rob Edwards' bioinformatics lab at Flinders University
Jupyter Notebook
38
star
4

SRA_Metadata

Get, parse, and extract information from the SRA metadata files
Python
36
star
5

PyFBA

A python implementation of flux balance analysis to model microbial metabolism
Python
25
star
6

ProphagePredictionComparisons

Comparisons of multiple different prophage predictions
Jupyter Notebook
24
star
7

partie

PARTIE is a program to partition sequence read archive (SRA) metagenomics data into amplicon and shotgun data sets. The user-supplied annotations of the data sets can not be trusted, and so PARTIE allows automatic separation of the data.
Perl
24
star
8

sphae

Phage annotations and predictions. A spae is a prediction or foretelling. We'll foretell you what your phage is doing!
Python
23
star
9

crAssphage

Sequencing and analysis of crAssphage regions from around the globe
Python
16
star
10

PhageHosts

This is the complete code base used in Robert A. Edwards, Katelyn McNair, Karoline Faust, Jeroen Raes, and Bas E. Dutilh (2015) Computational approaches to predict bacteriophage–host relationships. FEMS Microbiology Reviews doi: 10.1093/femsre/fuv048
Python
15
star
11

genbank_to

Convert genbank files to a swath of other formats
Python
13
star
12

mgi-adapters

Trim adapters from MGI sequence data
C
9
star
13

primer-trimming

Fast C code for identifying and removing primers and adapters
C
9
star
14

fasta_validator

C code to validate a fasta file
C
8
star
15

py_fasta_validator

A Python extension of the fasta validator
Python
7
star
16

PhispyAnalysis

Analysis of phispy data
Jupyter Notebook
4
star
17

PhageProteomicTree

The phage proteomic tree was a breakthrough in evolution, taxonomy, and phylogenetics ... but nobody realized its global importance
Perl
4
star
18

pyctv

Parse and incorporate the ICTV Virus Metadata Resource file
Python
3
star
19

cameraGUI

A repository for the Prosilica Camera GUI
Python
3
star
20

repeatfinder

Fast code for searching for direct and indirect repeats in DNA sequences.
Python
3
star
21

SearchSRA

Tools to search through the Sequence Read Archive using XSEDE's Jetstream
Shell
3
star
22

SearchSRAToolKit

Tools for processing data generated by the Search SRA
Python
3
star
23

atavide_lite

A simpler version of atavide that relies only on slurm or PBS scripts. Some of the settings may be specific to our compute resources.
Python
2
star
24

qudaich

Qudaich (queries and unique database alignment inferred by clustering homologs) is a software package for aligning sequences.
C++
2
star
25

atavide

Atavistic processing of metagenomics data.
Python
2
star
26

CoralImageAnalysis

A central repo for all the coral image analysis code generated by the Edwards bioinformatics lab at San Diego State University
Python
2
star
27

genetic_codes

Python code for translating sequences using different NCBI translation tables and genetic codes.
C
2
star
28

get_orfs

C code to translate a DNA sequence using different translation tables. Designed to be fast and lightweight, with few dependencies (only zlib and pthreads)
C
2
star
29

seqc

C and C++ libraries for working with sequences
1
star
30

CommonWorkflowLanguage

CWL codes and examples for searchSRA.org
Common Workflow Language
1
star
31

PPPF

Probabilistic Phage Protein Functions
Python
1
star
32

UCCD

Using Machine Language to Compare Ulcerative Colitis and Crohn's Disease
Jupyter Notebook
1
star
33

pawsey

Code for running lots of different things on pawsey. This is a bit of a generic bucket and some of the code will be duplicated elsewhere in different projects
Python
1
star
34

acacia-stream

testing some streaming using minimap and acacia and something
Python
1
star
35

pbj_placer

Rewrite jplacer files from phylosift for importing into ITOL and other tree viewing software
Python
1
star
36

PhiSigns

PhiSiGns is a web-based and standalone application that provides a simple and convenient tool to identify signature genes and design primers for PCR amplification of related genes from environmental samples
Perl
1
star