  • Stars: 109
  • Rank: 319,077 (Top 7%)
  • Language: C++
  • License: Apache License 2.0
  • Created almost 5 years ago
  • Updated 11 months ago

Repository Details

Anomaly Detection on Time-Evolving Streams in Real-time. Detecting intrusions (DoS and DDoS attacks), frauds, fake rating anomalies.

MSᴛʀᴇᴀᴍ

Implementation of the WWW 2021 paper "Fast Anomaly Detection in Multi-Aspect Streams".

MSᴛʀᴇᴀᴍ detects group anomalies from a multi-aspect data stream in constant time and memory, and outputs an anomaly score for each record. MSᴛʀᴇᴀᴍ builds on top of MIDAS to work in multi-aspect settings such as event-log data and multi-attributed graphs.

Demo

  1. Run bash run.sh KDD to compile the code and run it on the KDD dataset.
  2. Run bash run.sh DOS to compile the code and run it on the DOS dataset.
  3. Run bash run.sh UNSW to compile the code and run it on the UNSW dataset.

MSᴛʀᴇᴀᴍ

  1. Change directory to the MSᴛʀᴇᴀᴍ folder: cd mstream
  2. Run make to compile the code and create the binary
  3. Run ./mstream -n numericalfile -c categoricalfile -t timefile
  4. Run make clean to clean binaries

Command line options

  • -h --help: produce help message
  • -n --numerical: Numerical file name
  • -c --categorical: Categorical file name
  • -t --time: Timestamps file name
  • -o --output: Output file name (default: scores.txt)  
  • -r --rows: Number of Hash Functions (default: 2)  
  • -b --buckets: Number of Buckets (default: 1024)
  • -a --alpha: Temporal Decay Factor (default: 0.6)
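
To give a feel for how the rows, buckets, and alpha options could interact, here is a minimal sketch of a count-min-style structure with temporal decay. It is an illustration under assumptions, not the repository's code: the class name DecayedSketch, the hashing scheme, and the choice to decay whenever the timestamp advances are all hypothetical. Only the default values (2, 1024, 0.6) come from the option list above.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <limits>
#include <string>
#include <vector>

// Hypothetical illustration only: a count-min-style sketch with temporal decay.
// rows, buckets, and alpha mirror the -r, -b, and -a options; everything else
// (class name, hashing, when decay happens) is an assumption.
class DecayedSketch {
public:
    DecayedSketch(std::size_t rows = 2, std::size_t buckets = 1024, double alpha = 0.6)
        : rows_(rows), buckets_(buckets), alpha_(alpha), counts_(rows * buckets, 0.0) {
        assert(rows > 0 && buckets > 0);
        assert(alpha >= 0.0 && alpha <= 1.0);
    }

    // Add one occurrence of key to its bucket in every row.
    void insert(const std::string& key) {
        for (std::size_t r = 0; r < rows_; ++r) counts_[index(r, key)] += 1.0;
    }

    // Estimate the decayed count of key as the minimum over all rows.
    double estimate(const std::string& key) const {
        double best = std::numeric_limits<double>::infinity();
        for (std::size_t r = 0; r < rows_; ++r) best = std::min(best, counts_[index(r, key)]);
        return best;
    }

    // Scale every count by alpha, e.g. when the stream moves to a new timestamp.
    void decay() {
        for (double& c : counts_) c *= alpha_;
    }

private:
    // Salt the key with the row number so each row uses a different hash.
    std::size_t index(std::size_t row, const std::string& key) const {
        std::size_t h = std::hash<std::string>{}(std::to_string(row) + ":" + key);
        return row * buckets_ + h % buckets_;
    }

    std::size_t rows_;
    std::size_t buckets_;
    double alpha_;
    std::vector<double> counts_;
};
```

In this sketch, memory is fixed at rows × buckets counters regardless of stream length. More rows or buckets reduce the chance that collisions inflate an estimate, and a smaller alpha makes older records fade faster.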

Input file format for MSᴛʀᴇᴀᴍ

MSᴛʀᴇᴀᴍ expects the input multi-aspect record stream to be stored in three files:

  1. Numerical file: contains comma-separated numerical features.
  2. Categorical file: contains comma-separated categorical features.
  3. Time file: contains timestamps.

The numerical and categorical files contain the corresponding features of each multi-aspect record. Records must be sorted in non-decreasing order of their timestamps, and the column delimiter must be a comma (,).
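
The format requirements above can be checked before running the tool. The following is a minimal sketch, assuming the time file holds one integer timestamp per line (the README does not state the timestamp type or layout, so that is an assumption). The helper names timestampsSorted and fieldCount are hypothetical and not part of MSᴛʀᴇᴀᴍ.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Returns true if the stream holds timestamps in non-decreasing order.
// Assumption: one integer timestamp per line; blank lines are skipped.
bool timestampsSorted(std::istream& in) {
    long long prev = 0;
    bool first = true;
    std::string line;
    while (std::getline(in, line)) {
        if (line.empty()) continue;
        long long t = std::stoll(line);  // throws if the line is not a number
        if (!first && t < prev) return false;
        prev = t;
        first = false;
    }
    return true;
}

// Convenience overload for checking in-memory text.
bool timestampsSorted(const std::string& text) {
    std::istringstream in(text);
    return timestampsSorted(in);
}

// Counts the comma-separated fields on one line (an empty line has zero fields).
std::size_t fieldCount(const std::string& line) {
    if (line.empty()) return 0;
    std::size_t n = 1;
    for (char c : line) {
        if (c == ',') ++n;
    }
    return n;
}
```

For example, timestampsSorted(std::string("1\n1\n2\n")) accepts repeated timestamps because the order only needs to be non-decreasing, and fieldCount can be applied to each line of the numerical or categorical file to confirm that every record has the same number of columns.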

Datasets

  1. KDDCUP99
  2. CICIDS-DoS
  3. UNSW-NB 15
  4. CICIDS-DDoS

Citation

If you use this code for your research, please consider citing our WWW paper.

@inproceedings{bhatia2021mstream,
    title={Fast Anomaly Detection in Multi-Aspect Streams},
    author={Siddharth Bhatia and Arjit Jain and Pan Li and Ritesh Kumar and Bryan Hooi},
    booktitle={The Web Conference (WWW)},
    year={2021}
}