  • Stars
    2,450
  • Rank 18,226 (Top 0.4 %)
  • Language
    Python
  • Created about 4 years ago
  • Updated 7 months ago

Repository Details

Reverse engineering SARS-CoV-2

Reverse engineering the coronavirus (SARS-CoV-2)

Start here: corona.py

💭 Background

This project applies techniques from reverse engineering to understand the SARS-CoV-2 virus. The goal here is simply to build an understanding of the virus from first principles.

Biology vs. software

Biological systems are fundamentally information processing systems. While not a perfect analogy, software provides a useful framework for thinking about biology. The table below provides a rough outline of this analogy.

| 🔬 Biology | 💻 Software | Notes |
|---|---|---|
| nucleotide | byte | |
| genome | bytecode | |
| translation | disassembly | 3 byte wide instruction set with arbitrary "reading frames" |
| protein | function | a polyprotein is a function with multiple pieces |
| protein secondary structure | basic blocks | 80% accuracy in prediction |
| protein tertiary structure | | This seems like the hard one to predict: https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0205819 |
| quaternary structure | compiled function with inlining | https://en.wikipedia.org/wiki/Protein%E2%80%93protein_interaction_prediction |
| gene | library | bacteria are statically linked, viruses are dynamically linked |
| transcription | loading | |
| protein structure prediction | library identification | |
| genome analysis | static analysis | |
| molecular dynamics simulations of protein folding | dynamic analysis | Simulation doesn't seem to work yet. Constrained by tooling and compute. |
| no equivalent | execution | We are reverse engineering a CAD format. Runs more like FPGA code, all at once. No serial execution. (What are the FPGA reverse engineering tools?) |
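To make the "reading frames" note in the translation row concrete, here is a minimal sketch of splitting a sequence into 3-letter codons at each of the three possible offsets. The helper name `codons` is hypothetical and is not taken from the repository.

```python
def codons(seq, frame=0):
    """Split seq into 3-letter codons, starting at offset `frame` (0, 1 or 2).

    Hypothetical helper for illustration; trailing bases that do not
    fill a whole codon are dropped.
    """
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


if __name__ == "__main__":
    rna = "AUGGCCUAA"
    for frame in range(3):
        # The same bytes "disassemble" differently depending on the offset.
        print(frame, codons(rna, frame))
```

The same nine bases yield three different codon lists, which is why a disassembler-style tool has to choose (or try) a frame before decoding.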

🔧 Progress

Downloading the SARS-CoV-2 genome

GenBank is the NIH genetic sequence database, an annotated collection of all publicly available DNA and RNA sequences. download_sequences.py downloads the SARS-CoV-2 sequences available in GenBank.
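As a rough illustration of what fetching one GenBank record can look like, the sketch below builds a request URL for the NCBI E-utilities `efetch` endpoint using only the standard library. This is an assumption about one possible approach, not a description of what download_sequences.py actually does; the function names are hypothetical, and `MN908947.3` is used as an example accession for an early SARS-CoV-2 genome.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# NCBI E-utilities efetch endpoint.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def efetch_url(accession, rettype="fasta"):
    """Build an efetch URL for one nucleotide record (hypothetical helper)."""
    params = {"db": "nuccore", "id": accession, "rettype": rettype, "retmode": "text"}
    return EFETCH + "?" + urlencode(params)


def fetch_sequence(accession):
    """Download the record as text. Requires network access."""
    with urlopen(efetch_url(accession)) as resp:
        return resp.read().decode("utf-8")


if __name__ == "__main__":
    print(efetch_url("MN908947.3"))
```

Only `efetch_url` is exercised without a network connection; `fetch_sequence` is a thin wrapper that would need error handling and rate limiting in real use.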

Translating RNA to proteins

lib.py contains a function translate that converts an RNA sequence to a chain of amino acids. This function is used in corona.py.
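For readers who want to see the idea in code, here is a minimal sketch of RNA-to-protein translation using the standard genetic code. It is not the `translate` function from lib.py; the name `translate_sketch`, its `stop_at_stop` parameter, and the use of `X` for unknown codons are assumptions made for illustration.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order
# (first base varies slowest, third base fastest). "*" marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate_sketch(rna, stop_at_stop=True):
    """Translate an RNA (or DNA) string into one-letter amino acid codes.

    Hypothetical helper: reads frame 0 only, maps unknown codons to "X",
    and optionally stops at the first stop codon.
    """
    rna = rna.upper().replace("T", "U")
    protein = []
    for i in range(0, len(rna) - 2, 3):
        aa = CODON_TABLE.get(rna[i:i + 3], "X")
        if aa == "*" and stop_at_stop:
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate_sketch("AUGGCCUAA"))
```

Building the table from the 64-character string keeps the example short; a hand-written dictionary would be equivalent.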

Annotating functions

The translate function is used in corona.py to identify and annotate functions for all proteins encoded by the genome.
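One simple way to locate candidate "functions" in a genome is to scan each forward reading frame for a start codon followed by an in-frame stop codon. The sketch below shows that idea only; it is an assumption, not the annotation method used in corona.py, and the name `find_orfs` and its `min_codons` parameter are hypothetical. Real coronavirus genes also involve features such as ribosomal frameshifting and polyprotein cleavage, which a naive scan like this does not capture.

```python
START = "AUG"
STOPS = {"UAA", "UAG", "UGA"}


def find_orfs(rna, min_codons=2):
    """Return sorted (start, end, frame) tuples for AUG...stop stretches.

    Hypothetical helper: scans the three forward frames only. `end` is
    exclusive and includes the stop codon; `min_codons` counts the codons
    before the stop, including the start codon.
    """
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(rna) - 2, 3):
            codon = rna[i:i + 3]
            if start is None and codon == START:
                start = i
            elif start is not None and codon in STOPS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return sorted(orfs)


if __name__ == "__main__":
    rna = "CCAUGGCCUAAGG"
    for start, end, frame in find_orfs(rna):
        print(frame, start, end, rna[start:end])
```

Each hit could then be passed to a translation routine to obtain the amino acid chain for that region.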

Folding proteins

The OpenMM toolkit is used for molecular simulation of protein folding in fold.py.

💡 Work to be done

  • Automatic extraction of genes from different coronaviruses
  • A good multi-sequence comparison tool
  • Molecular dynamics?
  • Secondary Structure prediction on orf1a?

❓ Open questions

💧 Testing

How tests work

Homemade test?

💊 Possible treatments and prophylactics

⚠️ Disclaimer: The information in this repository is for informational purposes only. It is not medical advice.

Hydroxychloroquine + zinc

RdRP inhibitors

Dexamethasone

Lopinavir-Ritonavir (AIDS cocktail)

📚 Resources

Coronavirus-related publications

Biology

Bioinformatics

Epidemic modeling

Antibodies

Masks

Vaccines

Genome studies (what genes = bad covid)

DNA Synthesis

More Repositories

1

qira

QEMU Interactive Runtime Analyser
C
3,806
star
2

fromthetransistor

From the Transistor to the Web Browser, a rough outline for a 12 week course
3,512
star
3

minikeyvalue

A distributed key value store in under 1000 lines. Used in production at comma.ai
Go
2,791
star
4

ai-notebooks

Some ipython notebooks implementing AI algorithms
Jupyter Notebook
959
star
5

twitchslam

A toy implementation of monocular SLAM written while livestreaming
Python
941
star
6

configuration

Like some files bro
Haskell
379
star
7

tinyvoice

Letting computers listen to you and really care
Jupyter Notebook
361
star
8

twitchchess

like twitchslam, for chess
Python
349
star
9

lolrecaptcha

We try to break the recaptcha for the Merry Christmas for all!
Go
292
star
10

mergesorts

mergesort in many languages
Shell
254
star
11

twitchcore

It's a core. Made on Twitch.
Verilog
229
star
12

cuda_ioctl_sniffer

Sniff CUDA ioctls
C
147
star
13

eda-reversing

The Embedded Disassembler
C++
110
star
14

kvm-kext

An implementation of /dev/kvm for Mac OS X
C
108
star
15

twitchcoq

It's a poorly named metamath verifier
Prolog
104
star
16

twitchtactoe

Tic Tac Toe in React because it is Simple Skills Sunday
JavaScript
102
star
17

battlechess

A distributed decentralized chess tournament
Python
99
star
18

tinyxxx

tiny corporation website
HTML
96
star
19

hammer-website

HTML
71
star
20

edgetpuxray

Enabling tinygrad compatibility with the Google Edge TPU
C++
68
star
21

pie

Computing digits of pi for the people
JavaScript
68
star
22

eda-2

Even better than eda-reversing...I hope
C++
61
star
23

haskell-scheme

Writing Scheme in Haskell
Haskell
58
star
24

twitchctw

compression = AI
Python
53
star
25

coq-hardy

Formalizing the Theorems from Hardy's "An Introduction to the Theory of Numbers" in coq
Coq
52
star
26

freethedsp

For winners only. Are you a winner?
C
40
star
27

twitchcoins

Python
36
star
28

openhexagon

An attempt at an open source toolchain for the Hexagon DSP
Shell
35
star
29

crappycase

So many shitty coders: Adobe, Blizzard, Valve. This is a case insensitivity emulator.
C
29
star
30

body_loop

comma body does a loop around the office
Python
28
star
31

amdgpu-dkms

Unpacking AMD's dkms packages
C
25
star
32

jenkyiphonetools

iPhone Tools of the lowest quality
Python
25
star
33

lowqualityraytracer

ever wonder how to raytrace? me too. i love america
Python
25
star
34

commaled

comma.ai LED controller cause the car needs some lights bro. SWAG!
Assembly
25
star
35

trinity-osxnew

C
22
star
36

aes_serial

There is so much swag in the world, just some of it is hidden -- Gandalf
C
17
star
37

eda-3

eda-3 from many years ago
JavaScript
13
star
38

collfun

It's Christmas time, you know what it is
Python
11
star
39

nnweights

6
star
40

7900xtx

5
star
41

gpysieve

ghetto sieves in python that don't work
Python
4
star
42

angr-travis

Run travis-ci testing on release version of angr
Shell
4
star
43

tt06-fp4-mac

FP4 MAC Array
Tcl
3
star
44

tt-twitch

tenstorrent kernel from twitch
C++
2
star