• Stars: 121
• Rank: 293,924 (Top 6 %)
• Language: TeX
• Created over 11 years ago
• Updated over 1 year ago

Repository Details

My Ph.D. thesis on Outlier Selection and One-Class Classification

Outlier Selection and One-Class Classification

You can read my PhD thesis online or download it as PDF (~10MB).

What do a terrorist attack, a forged painting, and a rotten apple have in common? All three are anomalies: real-world observations that deviate from what is considered normal. Detecting anomalies is of utmost importance because an undetected anomaly can be dangerous or expensive. A human domain expert may suffer from three cognitive limitations: fatigue, information overload, and emotional bias. These limitations hamper the detection of anomalies. Outlier-selection and one-class classification algorithms can automatically classify data points as outliers in large amounts of data. In this thesis we study to what extent outlier-selection and one-class classification algorithms can support domain experts with real-world anomaly detection.

Figures

The figures in the thesis are created using Python, MATLAB and TikZ. The TikZ code of the figures can be found in /figures/tikz. To compile all the figures to PDF, I wrote a script called tikz2pdf.

$ tikz2pdf figures/tikz/*.tikz --template figures/thesis-template.tex --output figures/pdf/
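
As a rough illustration of what such a script has to do, the Python sketch below wraps one TikZ file in a LaTeX template and compiles the result to PDF. It is not the actual tikz2pdf implementation: the placeholder marker `%%TIKZ_BODY%%`, the function names, and the use of pdflatex are all assumptions made for this example, and the real thesis-template.tex may mark the insertion point differently.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path

# Hypothetical marker; the real template may use a different convention.
PLACEHOLDER = "%%TIKZ_BODY%%"


def render_document(template: str, tikz_code: str,
                    placeholder: str = PLACEHOLDER) -> str:
    """Return the template text with the placeholder replaced by the TikZ code."""
    if placeholder not in template:
        raise ValueError(f"template does not contain {placeholder!r}")
    return template.replace(placeholder, tikz_code)


def compile_figure(tikz_path: Path, template_path: Path,
                   output_dir: Path) -> Path:
    """Compile a single .tikz file to <output_dir>/<name>.pdf (assumes pdflatex is on PATH)."""
    document = render_document(template_path.read_text(),
                               tikz_path.read_text())
    output_dir.mkdir(parents=True, exist_ok=True)

    # Build in a scratch directory so .aux and .log files do not clutter the repo.
    with tempfile.TemporaryDirectory() as build_dir:
        tex_path = Path(build_dir) / f"{tikz_path.stem}.tex"
        tex_path.write_text(document)
        subprocess.run(
            ["pdflatex", "-interaction=nonstopmode", "-halt-on-error",
             tex_path.name],
            cwd=build_dir,
            check=True,           # raise CalledProcessError if compilation fails
            capture_output=True,  # keep the LaTeX log out of the terminal
        )
        pdf_path = output_dir / f"{tikz_path.stem}.pdf"
        shutil.copy(tex_path.with_suffix(".pdf"), pdf_path)
    return pdf_path
```

Calling `compile_figure` for every path in `Path("figures/tikz").glob("*.tikz")` would mirror the command above. One limitation of building in a temporary directory is that figures which load data files through relative paths would not find them; a real tool has to handle that case.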

Below are some figures from the thesis. Please note that these are rendered with a different font. Also, the conversion from PDF to PNG with ImageMagick isn't all that great.

figures/tikz/bg-banana-roc.tikz

Example 1

figures/tikz/bg-multiclass.tikz

Example 2

figures/tikz/eval-boxplot-preprocessing-pca.tikz

Example 3

figures/tikz/mlc-mapping-auc-overview.tikz

Example 4

figures/tikz/sos-densities.tikz

Example 5

figures/tikz/sos-graph-matlab-binding.tikz

Example 6

figures/tikz/sos-graphs-sample.tikz

Example 7

figures/tikz/sos-nemenyi.tikz

Example 8

More Repositories

1. data-science-at-the-command-line (HTML, 3,735 stars): Data Science at the Command Line
2. scikit-sos (Python, 91 stars): A Python implementation of the Stochastic Outlier Selection algorithm
3. dsutils (Shell, 90 stars): Command-line tools for doing data science
4. rush (R, 55 stars): R One-Liners from the Shell
5. tmuxr (R, 52 stars): Manage tmux from R
6. datascienceworkshops-dockerfiles (Dockerfile, 49 stars): Various Dockerfiles that we use in our workshops
7. python-polars-the-definitive-guide (Jupyter Notebook, 47 stars): Scripts and datasets for the O'Reilly book Python Polars: The Definitive Guide
8. tikz2pdf (Python, 37 stars): Compile TikZ figures to PDF
9. raylibr (R, 32 stars): R package that wraps Raylib, a simple and easy-to-use library to enjoy videogames programming
10. sample (Python, 23 stars): Filter lines from standard input according to some probability, with a given delay, and for a certain duration.
11. r4ds-python-plotnine (Jupyter Notebook, 22 stars): A translation of the visualisation chapters from "R for Data Science" to Python using Plotnine and Pandas.
12. i3-wm-scripts (Shell, 14 stars): Various scripts for the i3 window manager to allow for renaming workspaces
13. poor-mans-parallel-pipelines (9 stars): Poor Man's Parallel Pipelines (run presentation with `mdp README.md` # https://github.com/visit1985/mdp)
14. knitractive (R, 9 stars): A knitr engine to simulate interactive sessions
15. rexpect (R, 9 stars): Automate Interactive Applications in R
16. turning-polars-dataframes-into-pretty-pictures-and-great-tables (Jupyter Notebook, 8 stars): Turning Polars DataFrames into Pretty Pictures and Great Tables
17. jeroenjanssens.github.io (Jupyter Notebook, 6 stars): My personal website
18. mendeley-file (Python, 6 stars): Open files associated with a Mendeley citation
19. tidytree (R, 4 stars): A tidy implementation of a Decision Tree Classifier
20. tidynaivebayes (R, 4 stars): A tidy implementation of the Naive Bayes classifier
21. lof-loci-occ (MATLAB, 3 stars): One-class classification implementations of LOF and LOCI
22. anscombes-quartet (Jupyter Notebook, 3 stars): Seeing Beyond Statistics: Anscombe's Quartet and the Power of Graphs
23. saveR (MATLAB, 2 stars): Save MATLAB workspace variables to an R data file
24. json2s3 (Python, 2 stars): Upload and partition a JSON stream to AWS S3
25. olt-vhddwp (Jupyter Notebook, 1 star): OLT: Visualizing High-Dimensional Data with Python
26. dog (Python, 1 star): Fetch missing man pages.
27. qbasic-annihilator (1 star): An action-packed game written in QBasic.
28. presto (Java, 1 star): Presto: A Poseidon research tool to create artificial vessel trajectories