pronto
A Python frontend to ontologies.
🗺️ Overview
Pronto is a Python library to parse, browse, create, and export
ontologies, supporting several ontology languages and formats. It
implements the specifications of the Open Biomedical Ontologies 1.4
in the form of a safe high-level interface. If you're only interested in
parsing OBO or OBO Graphs documents, you may wish to consider fastobo instead.
🏳️ Supported Languages
- Open Biomedical Ontologies 1.4. Because this format is fairly new, not all OBO ontologies can be parsed at the moment. See the OBO Foundry roadmap listing the compliant ontologies, and don't hesitate to contact their developers to push adoption forward.
- OBO Graphs in JSON format. The format is not yet stabilized, so the results may change from file to file.
- Web Ontology Language 2 (OWL2) in RDF/XML format. OWL2 ontologies are reverse-translated to OBO using the mapping defined in the OBO 1.4 Semantics.
🔧 Installing
Installing with pip is the easiest:
# pip install pronto # if you have the admin rights
$ pip install pronto --user # install it in a user-site directory
There is also a conda recipe in the bioconda channel:
$ conda install -c bioconda pronto
Finally, a development version can be installed from GitHub
using setuptools, provided you have the right dependencies
installed already:
$ git clone https://github.com/althonos/pronto
$ cd pronto
# python setup.py install
💡 Examples
If you're only reading ontologies, you'll only use the Ontology
class, which is the main entry point.
>>> from pronto import Ontology
It can be instantiated from a path to an ontology in one of the supported formats, even if the file is compressed:
>>> go = Ontology("tests/data/go.obo.gz")
Loading a file from a persistent URL is also supported, although you may
prefer the Ontology.from_obo_library method if you use persistent URLs a lot:
>>> cl = Ontology("http://purl.obolibrary.org/obo/cl.obo")
>>> stato = Ontology.from_obo_library("stato.owl")
🏷️ Get a term by accession
Ontology objects can be used as mappings to access any entity
they contain from their identifier in compact form:
>>> cl['CL:0002116']
Term('CL:0002116', name='B220-low CD38-positive unswitched memory B cell')
Note that when loading an OWL ontology, URIs will be compacted to CURIEs whenever possible:
>>> aeo = Ontology.from_obo_library("aeo.owl")
>>> aeo["AEO:0000078"]
Term('AEO:0000078', name='lumen of tube')
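To make the idea of compaction concrete, here is a minimal standalone sketch of how an OBO-style persistent URL could be turned into a CURIE. It is not pronto's implementation: the `OBO_BASE` constant and the `compact_obo_uri` helper are hypothetical names introduced for illustration, and the sketch assumes the common PURL layout in which the ID space and the local ID are joined by an underscore.

```python
from typing import Optional

# Assumed base IRI for OBO Foundry term PURLs (an assumption, not read from pronto).
OBO_BASE = "http://purl.obolibrary.org/obo/"


def compact_obo_uri(uri: str) -> Optional[str]:
    """Return a CURIE for an OBO-style PURL, or None if it cannot be compacted."""
    if not uri.startswith(OBO_BASE):
        return None
    local = uri[len(OBO_BASE):]
    # Split "AEO_0000078" into the ID space ("AEO") and the local ID ("0000078").
    prefix, sep, local_id = local.partition("_")
    if not sep or not prefix or not local_id:
        return None
    return f"{prefix}:{local_id}"


print(compact_obo_uri("http://purl.obolibrary.org/obo/AEO_0000078"))  # AEO:0000078
print(compact_obo_uri("http://example.org/other#thing"))              # None
```

URIs that do not fit the pattern are left alone here, which mirrors the "whenever possible" caveat above.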
🖊️ Create a new term from scratch
We can load an ontology, and edit it locally. Here, we add a new protein class to the Protein Ontology.
>>> pr = Ontology.from_obo_library("pr.obo")
>>> brh = pr.create_term("PR:XXXXXXXX")
>>> brh.name = "Bacteriorhodopsin"
>>> brh.superclasses().add(pr["PR:000001094"]) # is a rhodopsin-like G-protein
>>> brh.disjoint_from.add(pr["PR:000036194"]) # disjoint from eukaryotic proteins
✏️ Convert an OWL ontology to OBO format
The Ontology.dump method can be used to serialize an ontology to any of the
supported formats (currently OBO and OBO JSON):
>>> edam = Ontology("http://edamontology.org/EDAM.owl")
>>> with open("edam.obo", "wb") as f:
...     edam.dump(f, format="obo")
🌿 Find ontology terms without subclasses
The terms method of Ontology instances can be used to
iterate over all the terms in the ontology (including the
ones that are imported). We can then use the is_leaf
method of Term objects to check if the term is a leaf in the
class inclusion graph, i.e. if it has no subclasses.
>>> ms = Ontology("ms.obo")
>>> for term in ms.terms():
...     if term.is_leaf():
...         print(term.id)
MS:0000000
MS:1000001
...
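For readers unfamiliar with the term, the sketch below illustrates what being a leaf of the class inclusion graph means, using a plain dictionary instead of an ontology. The identifiers and the `subclasses` mapping are made up for illustration, and the `is_leaf` function here is a toy stand-in, not the pronto method.

```python
# Toy class inclusion graph: each term ID maps to the IDs of its direct subclasses.
# All identifiers are invented for this example.
subclasses = {
    "EX:0000001": ["EX:0000002", "EX:0000003"],
    "EX:0000002": ["EX:0000004"],
    "EX:0000003": [],
    "EX:0000004": [],
}


def is_leaf(term_id: str) -> bool:
    """A term is a leaf when no other term is declared as its subclass."""
    return not subclasses.get(term_id)


leaves = sorted(term_id for term_id in subclasses if is_leaf(term_id))
print(leaves)  # ['EX:0000003', 'EX:0000004']
```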
🤫 Silence warnings
pronto is explicit about the parts of the code that make
non-standard assumptions, or that lack the capability to handle certain
constructs. It does so by emitting warnings with the warnings module,
which can get quite verbose.
If you are fine with the inconsistencies, you can manually disable
warning reports in your consumer code with the filterwarnings function:
import warnings
import pronto
warnings.filterwarnings("ignore", category=pronto.warnings.ProntoWarning)
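If a global filter is too broad, the standard library also lets you silence a category only around a specific call. The sketch below shows the pattern with `warnings.catch_warnings`. To stay runnable without any ontology file, it uses a hypothetical `StandInWarning` class and a hypothetical `noisy_load` function; in real code you would presumably pass `pronto.warnings.ProntoWarning` and call `Ontology(...)` inside the block.

```python
import warnings


class StandInWarning(Warning):
    """Hypothetical stand-in for pronto.warnings.ProntoWarning."""


def noisy_load() -> str:
    """Hypothetical loader that emits a warning, standing in for Ontology(...)."""
    warnings.warn("unsupported construct", StandInWarning)
    return "ontology"


# The filter applies only inside the block; previous filters are restored on exit.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", category=StandInWarning)
    result = noisy_load()
```

Warnings raised elsewhere in the program keep their normal behaviour, which can help when other pronto calls should still report problems.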
📖 API Reference
A complete API reference can be found in the online documentation, or
directly from the command line using pydoc:
$ pydoc pronto.Ontology
📜 License
This library is provided under the open-source MIT license. Please cite this library if you are using it in a scientific context using the following DOI: 10.5281/zenodo.595572