pydna
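Planning genetic constructs with many parts and assembly steps, such as recombinant
metabolic pathways, is often difficult to document properly.
The pydna Python package provides human-readable formal descriptions of cloning and genetic assembly strategies in Python, which allow for simulation and verification.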
A cloning strategy expressed in pydna is complete, unambiguous and stable.
Pydna provides simulation of:
- Primer design
- PCR
- Restriction digestion
- Ligation
- Gel electrophoresis of DNA with generation of gel images
- Homologous recombination
- Gibson assembly
- Golden gate assembly
Virtually any sub-cloning experiment can be described in pydna, and its execution yields the sequences of intermediate and final DNA molecules.
Pydna has been designed with the goal of being understandable for biologists with only some basic understanding of Python.
Pydna can formalize planning and sharing of cloning strategies and is especially useful for complex or combinatorial DNA molecule constructions.
To get started, I have compiled some simple examples. For more elaborate use, look at the assembly strategies for D-xylose metabolic pathways in the MetabolicEngineeringGroupCBMA/ypk-xylose-pathways repository.
Usage
Most pydna functionality is implemented as methods of the double-stranded DNA sequence record classes Dseq and Dseqrecord, which are subclasses of the Biopython Seq and SeqRecord classes.
These classes make PCR primer design, PCR simulation and cut-and-paste cloning very simple:
As the example above shows, pydna keeps track of sticky ends and features.
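To make concrete what keeping track of sticky ends involves, here is a minimal plain-Python sketch. It does not use pydna's API: the `Fragment` class and the `digest` and `ligate` helpers are hypothetical names invented for this illustration, and the sketch only handles a single cut by an enzyme that leaves a 5' overhang.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fragment:
    """Toy linear duplex (hypothetical, not a pydna class).

    seq spans both strands; left_end and right_end hold the
    single-stranded bases at each end, written in top-strand
    orientation. An empty string means a blunt end.
    """
    seq: str
    left_end: str = ""
    right_end: str = ""


def digest(fragment, site, top_cut, bottom_cut):
    """Cut once at the first occurrence of site.

    top_cut and bottom_cut are offsets into the site (top-strand
    coordinates) at which the top and bottom strands are cut.
    """
    if not top_cut < bottom_cut:
        raise ValueError("this sketch only handles 5' overhangs")
    i = fragment.seq.find(site)
    if i == -1:
        raise ValueError(f"site {site!r} not found")
    a, b = i + top_cut, i + bottom_cut
    overhang = fragment.seq[a:b]
    left = Fragment(fragment.seq[:b], fragment.left_end, overhang)
    right = Fragment(fragment.seq[a:], overhang, fragment.right_end)
    return left, right


def ligate(x, y):
    """Join x and y if the right end of x matches the left end of y."""
    if x.right_end != y.left_end:
        raise ValueError("incompatible ends")
    joined = x.seq + y.seq[len(x.right_end):]
    return Fragment(joined, x.left_end, y.right_end)


# EcoRI cuts G^AATTC, leaving a four-base AATT overhang.
left, right = digest(Fragment("ccGAATTCgg"), "GAATTC", 1, 5)
print(left)
print(right)
print(ligate(left, right).seq)
```

Fragments cut with different enzymes, for example EcoRI and BamHI (G^GATCC), carry different overhangs, so `ligate` refuses to join them. Pydna's Dseq class performs this kind of bookkeeping for real enzymes and for both strands.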
Pydna can be very compact. The eleven lines of Python below simulate the construction of a recombinant plasmid. DNA sequences are downloaded from GenBank by accession numbers that are guaranteed to be stable over time.
from pydna.genbank import Genbank
gb = Genbank("you@example.com") # Tell Genbank who you are! Replace this placeholder with your own email address.
gene = gb.nucleotide("X06997") # Kluyveromyces lactis LAC12 gene for lactose permease.
from pydna.parsers import parse_primers
primer_f, primer_r = parse_primers('''>759_KlLAC12_fw (19-mer)
aaatggcagatcattcgag
>760_KlLAC12_rv (20-mer)
ttaaacagattctgcctctg''')
from pydna.amplify import pcr
pcr_prod = pcr(primer_f,primer_r, gene)
vector = gb.nucleotide("AJ001614") # pCAPs cloning vector
from Bio.Restriction import EcoRV
lin_vector = vector.linearize(EcoRV)
rec_vec = ( lin_vector + pcr_prod ).looped()
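As a rough illustration of what the PCR step in the listing above computes, the plain-Python sketch below finds where two primers anneal and returns the amplified region. This is not pydna's implementation: `toy_pcr` is a hypothetical helper, the template is made up rather than taken from X06997, and the sketch assumes perfect, unique, full-length annealing with the forward primer given first.

```python
COMPLEMENT = str.maketrans("acgtACGT", "tgcaTGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def toy_pcr(forward, reverse, template):
    """Return the top strand of the PCR product.

    The forward primer must match the top strand and the reverse
    primer must match the bottom strand, each exactly once.
    """
    top = template.lower()
    fwd_site = forward.lower()
    rev_site = reverse_complement(reverse).lower()
    if top.count(fwd_site) != 1 or top.count(rev_site) != 1:
        raise ValueError("each primer must anneal exactly once")
    start = top.index(fwd_site)
    end = top.index(rev_site) + len(rev_site)
    if start >= end:
        raise ValueError("primers do not face each other")
    return template[start:end]


forward = "aaatggcagatcattcgag"   # 759_KlLAC12_fw from the listing above
reverse = "ttaaacagattctgcctctg"  # 760_KlLAC12_rv from the listing above
# Made-up template: flanking bases, both primer sites and a short spacer.
template = "ggg" + forward + "ccccacgtacgt" + reverse_complement(reverse) + "ttt"
product = toy_pcr(forward, reverse, template)
print(product)
```

The product starts with the forward primer and ends with the reverse complement of the reverse primer, which is why the flanking bases of the template are absent from it.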
Pydna can automate the simulation of sub-cloning experiments using Python. This is helpful for generating examples for teaching purposes.
Read the documentation (below) or the cookbook with example files for further information.
Feedback & suggestions are very welcome! Please post a message in the Google group for pydna if you need help or have problems, questions or comments.
Who is using pydna?
Taylor, L. J., & Strebel, K. (2017). Pyviko: an automated Python tool to design gene knockouts in complex viruses with overlapping genes. BMC Microbiology, 17(1), 12. PubMed
Wang, Y., Xue, H., Pourcel, C., Du, Y., & Gautheret, D. (2021). 2-kupl: mapping-free variant detection from DNA-seq data of matched samples. In Cold Spring Harbor Laboratory (p. 2021.01.17.427048). DOI PubMed
An Automated Protein Synthesis Pipeline with Transcriptic and Snakemake
and other projects on GitHub.
There is an open access paper in BMC Bioinformatics describing pydna.
Please reference this paper when using pydna:
Pereira, F., Azevedo, F., Carvalho, Â., Ribeiro, G. F., Budde, M. W., & Johansson, B. (2015). Pydna: a simulation and documentation tool for DNA assembly strategies using python. BMC Bioinformatics, 16(142), 142.
Documentation
Documentation is built using Sphinx from docstrings in the code and displayed at Read the Docs.
The numpy docstring format is used.
Installation using pip
Pip is included in recent Python versions and is the officially recommended tool.
Pip installs the minimal installation requirements automatically, but not the optional requirements (see below).
pip install pydna
or use the --pre switch to get the latest version of pydna, including pre-releases.
pip install pydna --pre
For optional functionality do:
pip install pydna[gel,download,express,gui]
Remove options inside the square brackets as required, but be sure not to leave spaces as pip will not recognize the options. See below under "Optional dependencies".
Windows:
You should be able to pip install pydna from the Windows terminal as biopython now can be installed with pip as well.
C:\> pip install pydna
By default, Python and pip are not on the PATH. You can re-install Python and select this option during installation, or give the full path to pip. Try something like this, depending on where your copy of Python is installed:
C:\Python37\Scripts\pip install pydna
Source Code
Pydna is developed on GitHub. I am happy to collaborate on new features or bugfixes.
Minimal installation dependencies
The modules listed below are the minimal requirements for installing pydna. Biopython and pydivsufsort have C extensions, but the other modules are pure Python.
The above modules are installed as well as pyperclip and pyfiglet. Pydna is importable even without these two modules.
Optional dependencies
If the modules listed below in the first column are installed, they will provide the functionality listed in the second column.
Dependency | Function in pydna |
---|---|
scipy | gel simulation with pydna.gel |
matplotlib | gel simulation with pydna.gel |
pillow | gel simulation with pydna.gel |
numpy | gel simulation with pydna.gel |
pyparsing | fix corrupt Genbank files with pydna.genbankfixer |
requests | download sequences with pydna.download |
cai2 | codon adaptation index calculations in several modules |
pyqt5 | future plan for gui |
pyperclip | copy sequence to clipboard |
pyfiglet | print nice logotype (pydna.logo()) |
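To see which of these optional dependencies are present in the current environment, a small standard-library check along the following lines can help. It is only a sketch and not part of pydna: the function name is hypothetical, and the mapping from distribution name to import name is an assumption (for example, pillow is imported as `PIL` and pyqt5 as `PyQt5`). cai2 is left out because its import name is not stated here.

```python
from importlib.util import find_spec

# Distribution name -> import name (assumed; adjust if your setup differs).
OPTIONAL_MODULES = {
    "scipy": "scipy",
    "matplotlib": "matplotlib",
    "pillow": "PIL",
    "numpy": "numpy",
    "pyparsing": "pyparsing",
    "requests": "requests",
    "pyqt5": "PyQt5",
    "pyperclip": "pyperclip",
    "pyfiglet": "pyfiglet",
}


def available_optional_dependencies():
    """Map each distribution name to True if its module can be found."""
    return {
        dist: find_spec(module) is not None
        for dist, module in OPTIONAL_MODULES.items()
    }


for name, found in available_optional_dependencies().items():
    print(f"{name:12} {'installed' if found else 'missing'}")
```

`find_spec` only locates the module without importing it, so the check is quick and has no side effects.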
Requirements for running tests, coverage and profiling
The requirements can be installed by, for instance, pip install pytest pytest-cov pytest-doctestplus pytest-profiling coverage nbval requests-mock
Running the entire test suite also requires:
- scipy
- matplotlib
- pillow
- pyparsing
- requests
- cai2
- pyqt5
These can be installed by pip install pydna[gel,gui,download,express]
or by pip install scipy matplotlib pillow pyparsing requests cai2 pyqt5
Releases
See the releases page for the changes in each release.
Automatic testing & Release process
There are two GitHub Actions workflows for this package:
pydna_test_and_coverage_workflow.yml
pydna_pypi_build_workflow.yml
The pydna_test_and_coverage_workflow.yml workflow is triggered on all pushed commits for all branches.
This workflow runs tests, doctests and a series of Jupyter notebooks using pytest on Linux, Windows and macOS and all supported Python versions.
The other workflow builds a PyPI package using poetry.
It is triggered by publishing a GitHub release manually from the GitHub interface.
Building a PyPI package with Poetry
- Commit changes to git.
- Tag the commit according to the Semantic Versioning format, for example "v2.0.1a3". Do not forget the "v" or poetry will not recognize the tag.
  git tag v2.0.1a3
- Pydna uses the poetry-dynamic-versioning plugin for poetry. Run it to set the version:
  poetry dynamic-versioning # This sets the version number in the source files
- Verify the version:
  poetry version
- Build the package:
  poetry build # run this command in the root directory where the pyproject.toml file is located
- Verify the filenames of the files in the dist/ folder, they should match
- Publish to PyPI:
  poetry publish
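Because a missing "v" in the tag is easy to overlook, the tag can be checked before running git tag. The sketch below is only an illustration: `is_release_tag` is a hypothetical helper, and the regular expression is a simplified assumption covering tags like "v2.0.1" and "v2.0.1a3". It is not the pattern that the poetry-dynamic-versioning plugin itself uses.

```python
import re

# Simplified, assumed pattern: "v", three numeric parts and an optional
# pre-release suffix such as a3, b1 or rc2.
TAG_PATTERN = re.compile(r"v\d+\.\d+\.\d+((a|b|rc)\d+)?")


def is_release_tag(tag):
    """Return True if tag looks like v<major>.<minor>.<patch>[pre-release]."""
    return TAG_PATTERN.fullmatch(tag) is not None


for candidate in ("v2.0.1a3", "2.0.1a3", "v2.0.1"):
    print(candidate, is_release_tag(candidate))
```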
History
Pydna was made public in 2012 on Google Code.