


Palmetto

Palmetto is a quality measuring tool for topics

This is the implementation of coherence calculations for evaluating the quality of topics. If you want to learn more about coherence calculations and their meaning for topic evaluation, see the project homepage, especially the publications.

Palmetto from DICE is licensed under the AGPL v3.0 license.

Please take a look at the wiki page to read how Palmetto can be used. If you would like to use a different index than the one we provide, you can create your own index.

If you are using Palmetto for an experiment or something similar that leads to a publication, please cite the paper "Exploring the Space of Topic Coherence Measures" that you can find on the project website. A link to the project website is welcome as well :)

Applicability

The coherence measures implemented in Palmetto are mainly built on a reference index. This index is used to derive the counts needed to calculate the coherence values. These values can be used to measure the human interpretability of topics based on the topics' top words. Note that the preprocessing of the index influences the results.

It is highly recommended to use an index whose preprocessing matches the preprocessing applied to the corpus from which the topics were generated.

We use an English Wikipedia index that has been preprocessed with a lemmatizer. In practice, this means that word groups containing non-lemmatized words may lead to unintuitive results simply because these word forms are underrepresented or even missing in our index (e.g., #57). In these cases, it is recommended to generate your own index.
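
The effect of missing word forms can be illustrated with a minimal sketch. It does not use Palmetto's API and makes no claim about how Palmetto computes its values internally. It assumes a tiny in-memory collection of word sets standing in for the reference index, and it uses NPMI averaged over word pairs purely as a simple, well-known example of a count-based coherence measure. The class and method names (CoherenceSketch, countDocs, npmi, coherence) are hypothetical.

```java
import java.util.List;
import java.util.Set;

/** Toy illustration: a coherence value derived from co-occurrence counts. */
final class CoherenceSketch {

    /** Number of reference documents that contain all of the given words. */
    static long countDocs(List<Set<String>> docs, String... words) {
        List<String> wanted = List.of(words);
        return docs.stream().filter(d -> d.containsAll(wanted)).count();
    }

    /** Normalized pointwise mutual information of two words, in [-1, 1]. */
    static double npmi(List<Set<String>> docs, String w1, String w2) {
        double n = docs.size();
        double p1 = countDocs(docs, w1) / n;
        double p2 = countDocs(docs, w2) / n;
        double p12 = countDocs(docs, w1, w2) / n;
        if (p12 == 0.0) {
            return -1.0; // convention: the words never co-occur (or are missing)
        }
        if (p12 == 1.0) {
            return 0.0;  // degenerate case: both words occur in every document
        }
        return Math.log(p12 / (p1 * p2)) / -Math.log(p12);
    }

    /** Average NPMI over all pairs of a topic's top words. */
    static double coherence(List<Set<String>> docs, List<String> topWords) {
        if (topWords.size() < 2) {
            throw new IllegalArgumentException("need at least two top words");
        }
        double sum = 0.0;
        int pairs = 0;
        for (int i = 0; i < topWords.size(); i++) {
            for (int j = i + 1; j < topWords.size(); j++) {
                sum += npmi(docs, topWords.get(i), topWords.get(j));
                pairs++;
            }
        }
        return sum / pairs;
    }

    public static void main(String[] args) {
        // Hypothetical lemmatized "index": each set is one reference document.
        List<Set<String>> index = List.of(
                Set.of("cat", "dog", "pet"),
                Set.of("cat", "pet"),
                Set.of("dog", "pet"),
                Set.of("stock", "market"));

        // Lemmatized top words are found in the index.
        System.out.println(coherence(index, List.of("cat", "dog", "pet")));
        // Plural forms are missing from the lemmatized index, so every pair
        // falls back to the "never co-occurs" value.
        System.out.println(coherence(index, List.of("cats", "dogs", "pets")));
    }
}
```

In this toy setup, the second topic receives the lowest possible score although a human would judge both topics as equally interpretable. The only difference is that its word forms do not match the preprocessing of the index.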

Directories

The palmetto directory contains the Palmetto library.

The webApp directory contains a web application offering a small demo as well as a web service API for using Palmetto.

Docker

Palmetto can be used as a Docker container.

The index should be downloaded and extracted to some path (for example, /path/to/indexes). After extraction, the directory should contain the wikipedia_bd directory and the wikipedia_bd.histogram file.

path
+- to
  +- indexes
    +- wikipedia_bd
    +- wikipedia_bd.histogram
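
Before starting the container, it can help to verify that the extracted index has the layout shown above. The following is a minimal sketch of such a check and is not part of Palmetto. The class and method names (IndexLayoutCheck, hasExpectedLayout) are hypothetical, and /path/to/indexes is only the placeholder path used in this section.

```java
import java.nio.file.Files;
import java.nio.file.Path;

/** Checks that an index root contains the entries the container expects. */
final class IndexLayoutCheck {

    /**
     * Returns true if indexRoot contains the wikipedia_bd directory
     * and the wikipedia_bd.histogram file.
     */
    static boolean hasExpectedLayout(Path indexRoot) {
        return Files.isDirectory(indexRoot.resolve("wikipedia_bd"))
                && Files.isRegularFile(indexRoot.resolve("wikipedia_bd.histogram"));
    }

    public static void main(String[] args) {
        // Placeholder path from the text above; pass the real path as first argument.
        Path root = Path.of(args.length > 0 ? args[0] : "/path/to/indexes");
        if (hasExpectedLayout(root)) {
            System.out.println(root + " contains wikipedia_bd and wikipedia_bd.histogram");
        } else {
            System.out.println(root + " is missing wikipedia_bd or wikipedia_bd.histogram");
        }
    }
}
```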

Once the index is in place, the container can be run as follows:

docker run -p 7777:8080 -d -v /path/to/indexes/:/usr/local/indexes/:ro dicegroup/palmetto-service

After that, the demo application can be accessed at http://localhost:7777/.
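
Whether the container is up can also be checked programmatically. The sketch below is an assumption-laden example, not part of Palmetto. It only sends a GET request to the demo URL given above and treats any HTTP status below 400 as "reachable". It does not call the web service API, whose endpoints are described in the wiki. The class and method names (DemoReachability, isReachable) are hypothetical. It requires Java 11 or later.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

/** Sends a single GET request to see whether the demo application answers. */
final class DemoReachability {

    /** Returns true if the URI answers with an HTTP status below 400. */
    static boolean isReachable(URI uri) {
        HttpClient client = HttpClient.newBuilder()
                .connectTimeout(Duration.ofSeconds(5))
                .build();
        HttpRequest request = HttpRequest.newBuilder(uri)
                .timeout(Duration.ofSeconds(10))
                .GET()
                .build();
        try {
            HttpResponse<Void> response =
                    client.send(request, HttpResponse.BodyHandlers.discarding());
            return response.statusCode() < 400;
        } catch (IOException e) {
            return false; // connection refused, timeout, or similar
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Port 7777 matches the "-p 7777:8080" mapping of the docker run command.
        URI demo = URI.create("http://localhost:7777/");
        System.out.println(demo + (isReachable(demo) ? " is reachable" : " is not reachable"));
    }
}
```

If a different host port is used in the docker run command, the port in the URI has to be changed accordingly.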

Adapted Docker image

If the Palmetto code has been adapted locally, the Docker image can be built with the following command:

make build dockerize
