  • Stars
    12
  • Rank 1,597,372 (Top 32 %)
  • Language
    Python
  • License
    GNU Affero Genera...
  • Created over 11 years ago
  • Updated over 9 years ago

Repository Details

Python 3 tools for data mining in molecular biology

More Repositories

1

syntok

Text tokenization and sentence segmentation (segtok v2)
Python
201
star
2

segtok

Segtok v2 is here: https://github.com/fnl/syntok -- A rule-based sentence segmenter (splitter) and a word tokenizer using orthographic features.
Python
170
star
3

pymonad

"fork" of PyMonad on BitBucket to change the ``*`` functor/composition operator to ``<<``
Python
31
star
4

patricia-trie

a pure-Python PATRICIA trie implementation.
Python
31
star
5

medic

a Python 3 command-line tool to maintain a DB mirror of MEDLINE (https://pypi.python.org/pypi/medic) - ALERT: As I have moved out of science and am working as a consultant now, this project might need a new maintainer once PubMed changes its XML format. Heroes?
Python
25
star
6

progress_bar

an informative progress bar for Python 2+3 command-line tools
Python
13
star
7

asdm-tm-class

Course material for the Madrid ASDM class on text mining (C09)
Jupyter Notebook
12
star
8

classipy

A command-line tool to develop advanced text classifiers using SciKit-Learn.
Python
9
star
9

sentence_splitter

check my new splitter - segtok
Python
8
star
10

tokenizer

a concurrent, deterministic finite state tokenizer (for letter-based scripts)
Go
4
star
11

txtfnnl

a UIMA-based text mining pipeline
Java
3
star
12

SPECIES

a modified version of the SPECIES tagger
C++
2
star
13

otplc

A tool to convert corpus annotations between the brat annotation and OTPL formats.
Python
2
star
14

vimrc

my (Vim-centric) POSIX environment
Vim Script
2
star
15

cpp-project-template

A very basic C++ project structure using CMake, Catch2, and cxxopts.
C++
2
star
16

go

Golang source code collection
Go
2
star
17

bceval

BioCreative Evaluation Scripts and Library
Python
2
star
18

bootstrap

jump-start a simple GNU C project
C
2
star
19

lexikos

a minimal acyclic deterministic finite state automaton (MADFA)
Scala
1
star
20

gnamed

a tool to manage a unified repository of gene and protein names, symbols, keywords, literature references, and species associations
Python
1
star
21

word2numpy

A Python 3.0 port of word2vec.py, itself a Python 2.7 port of word2vec
Python
1
star
22

segmenter

scripts to pre-process plain-text: sentence segmentation, tokenization, and stemming
Perl
1
star
23

OnlineTaggerFramework

an online tagger wrapper for GATE that only spawns one global sub-process per processing resource
Java
1
star
24

fnl.github.io

my blog (http://fnl.es)
HTML
1
star
25

ibecs-to-omtd-transformer

A transformer that converts an IBECS XML file into an OMTD-SHARE corpus
Python
1
star
26

chemcheck

a syntax checker for BioCreative IV CHEMDNER task annotations
C
1
star
27

couchpy

a Python 3 library to programmatically access CouchDB (written when there was none, "long ago"...)
Python
1
star
28

libfsmg

A finite state machine library for pattern matching on generic types in Java sequence containers.
Java
1
star