• Stars: 276
• Rank 144,429 (Top 3 %)
• Language: Jupyter Notebook
• License: Apache License 2.0
• Created over 6 years ago
• Updated about 1 year ago

Repository Details

Exercises for the Analysis of Knowledge Graphs

Programming Exercises for the Analysis of Knowledge Graphs

This repository lets interested students and researchers perform hands-on analysis of knowledge graphs. It is developed primarily as part of the knowledge graph analysis lecture of the SDA Group at the University of Bonn, but the material is also useful to anyone else interested in the topic.

Knowledge Graphs - Things, not Strings!

Knowledge graphs represent knowledge in terms of entities and their relationships, as shown in the figure below. The nodes of a knowledge graph are the objects that are relevant in your domain; each has a unique identifier, so it represents a real-world "thing" rather than just a string label. The edges are the connections between those objects. Because knowledge graphs are intuitive and offer a number of benefits, they have become very popular over the past decade. Some of the best-known knowledge graphs are the Google Knowledge Graph (a major component of Google Search and other services), DBpedia (a knowledge graph extracted from Wikipedia), Wikidata, YAGO, the Facebook Social Graph, Satori (Microsoft Knowledge Graph) and the LinkedIn Knowledge Graph.
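As a minimal sketch of the "things, not strings" idea, a knowledge graph can be held in memory as a list of (subject, predicate, object) triples over identifiers. The `ex:` identifiers and the `neighbours` helper below are made up for illustration and are not taken from the exercises.

```python
from collections import defaultdict

# Each fact is a (subject, predicate, object) triple. The identifiers are
# hypothetical and stand for things, not for string labels.
triples = [
    ("ex:Bonn", "ex:locatedIn", "ex:Germany"),
    ("ex:UniversityOfBonn", "ex:locatedIn", "ex:Bonn"),
    ("ex:Alice", "ex:studiesAt", "ex:UniversityOfBonn"),
    ("ex:Alice", "ex:knows", "ex:Bob"),
]

# Index the outgoing edges of every node for quick lookup.
outgoing = defaultdict(list)
for subject, predicate, obj in triples:
    outgoing[subject].append((predicate, obj))


def neighbours(node):
    """Return the (predicate, object) pairs that leave the given node."""
    return list(outgoing.get(node, []))


print(neighbours("ex:Alice"))
```

Storing identifiers rather than labels means that two nodes with the same display name remain distinct entities.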

Many knowledge graphs are very large; they are often crowdsourced and/or generated from various sources. Relational learning methods can then be applied to them for a variety of tasks. Link prediction tries to find missing edges (e.g. suggesting friends via your social graph amounts to predicting missing edges to other persons). Link correction finds incorrect edges. Entity resolution maps entities mentioned in text to entities in a knowledge graph. Clustering groups entities based on their similarity. In the exercises, you will learn about relational learning methods for knowledge graphs.
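To make the friend-suggestion example concrete, here is a small sketch that ranks missing edges by the number of common neighbours. This is a simple baseline heuristic chosen for illustration, not one of the relational learning methods covered in the exercises; the names and edges are hypothetical.

```python
from itertools import combinations

# Hypothetical undirected "knows" edges of a small social graph.
knows = [
    ("alice", "bob"),
    ("alice", "carol"),
    ("bob", "dave"),
    ("carol", "dave"),
    ("dave", "erin"),
]

# Build a symmetric adjacency map: person -> set of friends.
friends = {}
for a, b in knows:
    friends.setdefault(a, set()).add(b)
    friends.setdefault(b, set()).add(a)


def suggest_links(friends):
    """Score every unconnected pair by its number of common neighbours."""
    scores = {}
    for a, b in combinations(sorted(friends), 2):
        if b in friends[a]:
            continue  # edge already exists, nothing to predict
        common = len(friends[a] & friends[b])
        if common > 0:
            scores[(a, b)] = common
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


ranking = suggest_links(friends)
print(ranking)
```

Pairs that share more friends rank higher, which matches the intuition that they are more likely to be connected by a missing edge.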

The two knowledge representation formalisms used in the exercises are RDF knowledge graphs and property graph databases. Since knowledge graphs represent a whole network of entities, the methods for solving the above problems often go beyond simple feature-based machine learning. In the exercises, you will learn about the creation of knowledge graph embeddings via tensors and tensor factorisation as well as neural-network-based techniques. You will also learn about Markov networks.
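The tensor view can be sketched as follows, assuming a toy set of triples: each relation becomes one slice of a binary tensor whose entry (k, i, j) is 1 when relation k holds from entity i to entity j. Factorisation models in the style of RESCAL approximate each slice by a product of entity embeddings and a relation matrix; the `bilinear_score` function below only shows the form of that score and is not a trained model. All names and data are hypothetical.

```python
# Hypothetical toy triples (subject, relation, object).
triples = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "acme"),
]

# Assign a stable integer index to every entity and relation.
entities = sorted({t[0] for t in triples} | {t[2] for t in triples})
relations = sorted({t[1] for t in triples})
e_idx = {e: i for i, e in enumerate(entities)}
r_idx = {r: k for k, r in enumerate(relations)}

n, m = len(entities), len(relations)
# tensor[k][i][j] == 1 iff relation k holds from entity i to entity j.
tensor = [[[0] * n for _ in range(n)] for _ in range(m)]
for s, r, o in triples:
    tensor[r_idx[r]][e_idx[s]][e_idx[o]] = 1


def bilinear_score(a_i, R_k, a_j):
    """RESCAL-style score a_i^T R_k a_j for one candidate triple."""
    return sum(
        a_i[p] * R_k[p][q] * a_j[q]
        for p in range(len(a_i))
        for q in range(len(a_j))
    )


print(tensor[r_idx["knows"]][e_idx["alice"]][e_idx["bob"]])
```

In a trained model the embedding vectors and relation matrices would be learned so that the score is high for observed entries of the tensor and low for the rest.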

[Figure: knowledge graph example]

Exercise Overview

Each exercise contains a description of its tasks and background. We start with the formalisms for creating and querying knowledge graphs and then proceed to relational learning methods.

Contributing and Feedback

Please use the issue tracker to report problems and suggest improvements. Feel free to submit pull requests that improve the exercises. Please send other feedback by e-mail to Prof. Jens Lehmann.

Authors

License

The repository itself is under the Apache License 2.0. For the individual libraries and tools used in the exercises, please check their license conditions.

Acknowledgements

We thank the students of the Knowledge Graph Analysis lecture in Bonn as well as the developers of the frameworks we are using for their support in creating this learning resource.

More Repositories

1

DL-Learner

A tool for supervised Machine Learning in OWL and Description Logics
Java
150
star
2

Sparqlify

Sparql -> SQL Rewriter enabling virtual RDB -> RDF mappings
Java
120
star
3

AK-DE-biGRU

Improving Response Selection in Multi-turn Dialogue Systems by Incorporating Domain Knowledge
Python
58
star
4

jena-sparql-api

A collection of Jena-extensions for hiding SPARQL-complexity from the application layer
Java
56
star
5

HORUS-NER

HORUS: A framework to boost NLP tasks
Python
50
star
6

BioKEEN

A computational library for learning and evaluating biological knowledge graph embeddings - please see the main PyKEEN repo at https://github.com/pykeen/pykeen/
Jupyter Notebook
45
star
7

SemWeb2NL

Semantic Web related concepts converted to Natural language
Web Ontology Language
44
star
8

LiteralE

Knowledge Graph Embeddings learned from the structure and literals of knowledge graphs
Python
42
star
9

RdfProcessingToolkit

Command line interface based RDF processing toolkit to run sequences of SPARQL statements ad-hoc on RDF datasets, streams of bindings and streams of named graphs with support for processing JSON, CSV and XML using function extensions
Java
32
star
10

SML-Bench

A Benchmark for Machine Learning from Structured Data
Prolog
21
star
11

OWL2SPARQL

OWL To SPARQL Query Rewriter
Java
20
star
12

MA-INF-4222-NLP-Lab

MA-INF 4222: NLP Lab (University of Bonn)
Jupyter Notebook
19
star
13

KG-Copy_Network

Implementation of the paper: Using a KG-Copy Network for Non-Goal Oriented Dialogues
Python
18
star
14

Jassa-UI-Angular

Angular-JS based user interface components for Jassa
JavaScript
15
star
15

Polisis_Benchmark

Reproducing state-of-the-art results
Python
15
star
16

linked-uspto-patent-data

Java
11
star
17

lodservatory

Public SPARQL Endpoint Service Monitoring
Shell
11
star
18

dcat-suite

Semantic Web library and tool for retrieval and deployment of data from/to GIT, CKAN, MAVEN repos and triple stores using DCAT as the backbone.
Java
10
star
19

kgirnet

Scripts for KGIRNet model for ESWC
Python
10
star
20

MA-INF-4223-DBDA-Lab

Repository for Lab “Distributed Big Data Analytics” (MA-INF 4223), University of Bonn
Jupyter Notebook
10
star
21

SDA-README

Links to SDA Github organisations - visit those if you want to see all our projects
9
star
22

Wikipedia_TF_IDF_Dataset

Pre-computed IDF stats over all EN Wiki articles
9
star
23

DL-Learner-Protege-Plugin

A Protégé plugin for the DL-Learner framework
Java
9
star
24

OpenResearch

Public issue system for OPENRESEARCH/ConfIDent
Python
8
star
25

R2RLint

An RDB2RDF quality assessment tool
Java
7
star
26

minds

MINDS - Maths INsiDe SPARQL
Python
5
star
27

POEM

A package of training and evaluating multimodal knowledge graphs embedding models.
Python
5
star
28

ORE

Ontology Repair and Enrichment
Java
5
star
29

proxy_indicators

Jupyter Notebook
4
star
30

SubgraphIsomorphismIndex

An index data structure for fast isomorphic subset / subgraph queries
Java
4
star
31

codeCAI

Python
4
star
32

sda.tech

a linked data driven web page rendered by Jekyll-RDF
TeX
4
star
33

TagMap

Implementation of an Index Data Structure for Fast Subset and Superset Queries (based on the paper by Iztok Savnik)
Java
3
star
34

BigDataOcean-Harmonization

Tool for harmonization of datasets in BigDataOcean
Java
3
star
35

Embeddable-BSBM

Fork of the BSBM source code intended for closer integration in software projects
Java
3
star
36

BigDataOcean-LOV

BigDataOcean Metadata Repository (based on LOV)
HTML
3
star
37

qelos-core

Pytorch utilities
Python
3
star
38

EULAide

Interpretation of an EULA (End-User License Agreement) for the benefit of end-user
Roff
3
star
39

aksw-commons

A collection of utilities and micro frameworks with as little dependencies as possible. For the cases where Guava isn't enough.
Java
3
star
40

BlankNodeSurvey

Survey and accompanying toolkit for analyzing how triple stores support references to blank nodes
Shell
2
star
41

ARCANA

Large Scale Quality Assessment For Potential Dual Use with Spark - Master Thesis Project
Scala
2
star
42

KUPP

A Python package for preprocessing a knowledge graph.
Python
2
star
43

transformers_dialogue_evaluators

Resources to reproduce the results reported in the paper: "Language Model Transformers as Evaluators for Open-domain Dialogues".
Jupyter Notebook
2
star
44

Climate-Bot

This repo includes the data and code for the demo paper titled "Climate Bot: A Machine Reading Comprehension System for Climate Change Question Answering"
1
star
45

Conjure

A declarative approach to conjure RDF datasets from RDF datasets using SPARQL with caching of repeated operations.
Java
1
star
46

lascar.sda.tech

Workshop on Large Scale RDF Analytics - LASCAR
HTML
1
star
47

MA-INF-4221-NLP-Seminar

MA-INF 4221: NLP Seminar (University of Bonn)
1
star
48

iana-language-subtag-registry-rdf

Jena plugin to RDFize the iana subtag registry and validate language tags against it
Java
1
star
49

dialogue

shared repo for collected dialogue tools
Python
1
star
50

Beast

Benchmarking, Evaluation, and Analysis Stack - A powerful yet lightweight Java8/Jena-based RDF processing stack.
Java
1
star
51

SDA-Publications

TeX
1
star
52

KEEN-Model-Zoo

A model zoo for the KEEN Universe
Python
1
star
53

MoMatch

MoMatch (Multilingual Ontology Matching)
Scala
1
star