

An open-source tool for sequence learning in NLP built on TensorFlow.

Neural Monkey

Neural Sequence Learning Using TensorFlow


The Neural Monkey package provides a higher-level abstraction for sequential neural network models, most prominently in Natural Language Processing (NLP). It is built on TensorFlow and is intended for fast prototyping of sequential NLP models, e.g. for neural machine translation or sentence classification.

The higher-level API brings together a collection of standard building blocks (RNN encoder and decoder, multi-layer perceptron) and a simple way of adding new building blocks implemented directly in TensorFlow.

Usage

neuralmonkey-train <EXPERIMENT_INI>
neuralmonkey-run <EXPERIMENT_INI> <DATASETS_INI>
neuralmonkey-server <EXPERIMENT_INI> [OPTION] ...
neuralmonkey-logbook --logdir <EXPERIMENTS_DIR> [OPTION] ...
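
The training and inference commands above can also be launched from a script. The following is a minimal sketch, assuming the `neuralmonkey-train` and `neuralmonkey-run` executables are on your PATH. The helper names and the file names `experiment.ini` and `datasets.ini` are hypothetical placeholders, not part of Neural Monkey.

```python
import shutil
import subprocess


def train_command(experiment_ini):
    """Build the argument list for `neuralmonkey-train <EXPERIMENT_INI>`."""
    return ["neuralmonkey-train", str(experiment_ini)]


def run_command(experiment_ini, datasets_ini):
    """Build the argument list for `neuralmonkey-run <EXPERIMENT_INI> <DATASETS_INI>`."""
    return ["neuralmonkey-run", str(experiment_ini), str(datasets_ini)]


def launch(command, dry_run=True):
    """Run the command and return its exit code, or None on a dry run.

    Hypothetical helper: with dry_run=True nothing is executed, so the
    command can be inspected or logged first.
    """
    if dry_run:
        return None
    if shutil.which(command[0]) is None:
        raise FileNotFoundError(f"{command[0]} was not found on PATH")
    return subprocess.run(command, check=False).returncode


if __name__ == "__main__":
    # Placeholder file names; replace them with your own configuration files.
    print(train_command("experiment.ini"))
    print(run_command("experiment.ini", "datasets.ini"))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when a configuration path contains spaces.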

Installation

  • You need Python 3.6 (or higher) to run Neural Monkey.

  • When using a virtual environment, execute these commands to install the Python dependencies:

    $ source path/to/virtualenv/bin/activate
    
    # For GPU-enabled version
    (virtualenv)$ pip install --upgrade -r requirements-gpu.txt
    
    # For CPU-only version
    (virtualenv)$ pip install --upgrade -r requirements.txt
  • If you are using the GPU version, make sure that the LD_LIBRARY_PATH environment variable points to the lib and lib64 directories of your CUDA and CuDNN installations. Similarly, your PATH variable should point to the bin subdirectory of the CUDA installation directory.

  • If the training crashes on an unknown dependency, just install it with pip. Remember to keep your virtual environment up to date with the package requirements file, which may change over time. To update the dependencies, re-run the pip install command from above (pay attention to the distinction between the GPU and non-GPU versions).
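
The environment requirements above can be checked before starting a long training run. The following is a minimal sketch, not part of Neural Monkey: the function names are hypothetical, and it assumes a single installation root (here called `cuda_home`) that holds the lib, lib64 and bin subdirectories. If CUDA and CuDNN are installed in separate locations, call the check once per root.

```python
import os
import sys

# Minimum interpreter version stated in the installation notes.
MIN_PYTHON = (3, 6)


def python_ok(version_info=None):
    """Return True if the interpreter meets the minimum Python version."""
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) >= MIN_PYTHON


def _entries(value):
    """Split a PATH-like variable into normalized, non-empty entries."""
    return [os.path.normpath(p) for p in value.split(os.pathsep) if p]


def cuda_dirs_on_path(cuda_home, env=None):
    """Report which expected directories under `cuda_home` are configured.

    `cuda_home` is an assumed installation root; `env` defaults to the
    current process environment.
    """
    if env is None:
        env = os.environ
    ld_entries = _entries(env.get("LD_LIBRARY_PATH", ""))
    path_entries = _entries(env.get("PATH", ""))

    def under_home(name):
        return os.path.normpath(os.path.join(cuda_home, name))

    return {
        "lib": under_home("lib") in ld_entries,
        "lib64": under_home("lib64") in ld_entries,
        "bin": under_home("bin") in path_entries,
    }


if __name__ == "__main__":
    print("Python version OK:", python_ok())
    # "/usr/local/cuda" is only an example location.
    print(cuda_dirs_on_path("/usr/local/cuda"))
```

The check only compares path strings; it does not verify that the libraries in those directories actually load.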

Getting Started

There is a tutorial that you can follow, which gives you an overview of how to design your experiments with Neural Monkey.

Package Overview

  • bin: Directory with neuralmonkey executables

  • examples: Example configuration files for ready-made experiments

  • lib: Third party software

  • neuralmonkey: Python package files

  • scripts: Directory with tools that may come in handy. Note that dependencies for these tools may not be listed in the project requirements.

  • tests: Test files

Documentation

You can find the API documentation of this package here. The documentation files are generated from docstrings using autodoc and Napoleon extensions to the Python documentation package Sphinx. The docstrings should follow the recommendations in the Google Python Style Guide. Additional details on the docstring formatting can be found in the Napoleon documentation as well.
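
As an illustration of the docstring layout described above, here is a minimal sketch of a function documented in the Google style that Napoleon parses. The function `pad_batch` is a hypothetical example and is not part of the Neural Monkey API. In a typical Sphinx setup, the two extensions are enabled under the names `sphinx.ext.autodoc` and `sphinx.ext.napoleon`.

```python
def pad_batch(sentences, pad_token="<pad>"):
    """Pad tokenized sentences to the length of the longest one.

    Hypothetical helper used only to show the docstring sections
    (Args, Returns, Raises) of the Google style.

    Args:
        sentences: A list of sentences, each a list of string tokens.
        pad_token: The token appended to sentences that are too short.

    Returns:
        A new list of token lists that all have the same length.

    Raises:
        ValueError: If `sentences` is empty.
    """
    if not sentences:
        raise ValueError("sentences must not be empty")
    max_len = max(len(sentence) for sentence in sentences)
    # Copy each sentence so that the caller's lists are left unchanged.
    return [
        list(sentence) + [pad_token] * (max_len - len(sentence))
        for sentence in sentences
    ]
```

Each section header (`Args:`, `Returns:`, `Raises:`) is followed by an indented block, which is what lets Napoleon convert the docstring into structured reStructuredText for autodoc.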

Related projects

  • tflearn – a more general and less abstract deep learning toolkit built over TensorFlow

  • nlpnet – deep learning tools for tagging and parsing

  • NNBlocks – a library built over Theano containing NLP-specific models

  • Nematus – a tool for training and running Neural Machine Translation models

  • seq2seq – a general-purpose encoder-decoder framework for TensorFlow

  • OpenNMT – open-source NMT in Torch

Citation

If you use the tool for academic purposes, please consider citing the following paper:

@article{NeuralMonkey:2017,
    author = {Jind{\v{r}}ich Helcl and Jind{\v{r}}ich Libovick{\'{y}}},
    title = {{Neural Monkey: An Open-source Tool for Sequence Learning}},
    journal = {The Prague Bulletin of Mathematical Linguistics},
    year = {2017},
    address = {Prague, Czech Republic},
    number = {107},
    pages = {5--17},
    issn = {0032-6585},
    doi = {10.1515/pralin-2017-0001},
    url = {http://ufal.mff.cuni.cz/pbml/107/art-helcl-libovicky.pdf}
}

License

The software is distributed under the BSD License.
