  • Stars: 560
  • Rank 79,541 (Top 2 %)
  • Language: Python
  • License: Apache License 2.0
  • Created over 4 years ago
  • Updated 3 months ago

Repository Details

A repository of curated datasets from various attacks

A repository of curated datasets from various attacks that lets you:

  • Easily develop detections without having to build an environment from scratch or simulate an attack.
  • Test detections, specifically Splunk's Security Content.
  • Replay datasets into streaming pipelines to validate your detections in your production SIEM.

Installation

This project uses Git LFS. On macOS, git-lfs can be installed with Homebrew (for other operating systems, see the Git LFS installation instructions):

brew install git-lfs

Next, set up Git LFS for your user account. We recommend the --skip-smudge parameter, which prevents git clone from downloading every Git LFS file. Run the following command:

git lfs install --skip-smudge

Download the repository with this command:

git clone git@github.com:splunk/attack_data.git

Fetch all or select attack data sets

# This pulls all data. Warning: more than 9 GB of data
git lfs pull

# This pulls one data set directory
git lfs pull --include=datasets/attack_techniques/T1003.001/atomic_red_team/

# Or pull just one log like this
git lfs pull --include=datasets/attack_techniques/T1003.001/atomic_red_team/windows-sysmon.log
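
The include paths above follow the layout datasets/attack_techniques/<technique ID>/<tool>/<log file>. As a minimal sketch, the helper below builds the matching git lfs pull arguments from those three parts. The function name lfs_pull_args is hypothetical and not part of this repository, and the sketch only constructs the command; it does not run it.

```python
from typing import List, Optional


def lfs_pull_args(technique: str, tool: str, log_file: Optional[str] = None) -> List[str]:
    """Build the argument list for pulling one dataset directory or one log.

    Hypothetical helper, not part of the repository. It assumes the
    datasets/attack_techniques/<technique>/<tool>/ layout shown above.
    """
    path = f"datasets/attack_techniques/{technique}/{tool}/"
    if log_file is not None:
        # Narrow the pull to a single log file inside the directory.
        path += log_file
    return ["git", "lfs", "pull", f"--include={path}"]


if __name__ == "__main__":
    # Print the command instead of running it, so nothing is downloaded here.
    print(" ".join(lfs_pull_args("T1003.001", "atomic_red_team", "windows-sysmon.log")))
```

To actually fetch the data, the returned list could be passed to subprocess.run(args, check=True) from inside the cloned repository.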

Anatomy of a Dataset 🧬

Datasets

Datasets are defined by a common YML structure with the following fields:

field         description
id            UUID of the dataset
author        name of the author
date          last modified date
dataset       array of URLs where the hosted version of the dataset is located
description   description of the dataset, in as much detail as possible
environment   Markdown filename of the environment description (see below)
technique     array of MITRE ATT&CK techniques associated with the dataset
references    array of URLs that reference the dataset
sourcetypes   array of sourcetypes contained in the dataset

For example:

id: 405d5889-16c7-42e3-8865-1485d7a5b2b6
author: Patrick Bareiss
date: '2020-10-08'
description: 'Atomic Test Results: Successful Execution of test T1003.001-1 Windows
  Credential Editor Successful Execution of test T1003.001-2 Dump LSASS.exe Memory
  using ProcDump Return value unclear for test T1003.001-3 Dump LSASS.exe Memory using
  comsvcs.dll Successful Execution of test T1003.001-4 Dump LSASS.exe Memory using
  direct system calls and API unhooking Return value unclear for test T1003.001-6
  Offline Credential Theft With Mimikatz Return value unclear for test T1003.001-7
  LSASS read with pypykatz '
environment: attack_range
technique:
- T1003.001
dataset:
- https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/attack_techniques/T1003.001/atomic_red_team/windows-powershell.log
- https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/attack_techniques/T1003.001/atomic_red_team/windows-security.log
- https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/attack_techniques/T1003.001/atomic_red_team/windows-sysmon.log
- https://media.githubusercontent.com/media/splunk/attack_data/master/datasets/attack_techniques/T1003.001/atomic_red_team/windows-system.log
references:
- https://attack.mitre.org/techniques/T1003/001/
- https://github.com/redcanaryco/atomic-red-team/blob/master/atomics/T1003.001/T1003.001.md
- https://github.com/splunk/security-content/blob/develop/tests/T1003_001.yml
sourcetypes:
- XmlWinEventLog:Microsoft-Windows-Sysmon/Operational
- WinEventLog:Microsoft-Windows-PowerShell/Operational
- WinEventLog:System
- WinEventLog:Security
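
A definition like the one above can be sanity-checked before it is submitted. The sketch below works on an already-parsed definition (a plain Python dict, however it was loaded) and reports obvious problems. The function name check_dataset is hypothetical, and treating every field from the example as required is an assumption of this sketch, not a rule stated by the project.

```python
import re
import uuid
from typing import Any, Dict, List

# Field names taken from the example above; treating all of them as
# required is an assumption of this sketch.
REQUIRED_FIELDS = ("id", "author", "date", "description", "environment",
                   "technique", "dataset", "references", "sourcetypes")
LIST_FIELDS = ("technique", "dataset", "references", "sourcetypes")
# Matches technique IDs such as T1003 or T1003.001.
TECHNIQUE_RE = re.compile(r"^T\d{4}(\.\d{3})?$")


def check_dataset(meta: Dict[str, Any]) -> List[str]:
    """Return a list of problems found in an already-parsed dataset definition."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in meta]
    for name in LIST_FIELDS:
        if name in meta and not isinstance(meta[name], list):
            problems.append(f"{name} should be a list")
    if "id" in meta:
        try:
            uuid.UUID(str(meta["id"]))
        except ValueError:
            problems.append("id is not a valid UUID")
    technique = meta.get("technique")
    for item in technique if isinstance(technique, list) else []:
        if not TECHNIQUE_RE.match(str(item)):
            problems.append(f"unexpected technique ID: {item}")
    return problems
```

An empty list means no problems were found; each string in a non-empty result names one issue, so the function can be dropped into a pre-commit check or a test.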

Environments

Environments describe where a dataset was collected. At the moment there are no specific restrictions, although we provide a simple template to start from. The most common environment is the attack_range, since this is the tool used to generate attack datasets automatically.

Replay Datasets 📼

Most generated datasets are raw log files. There are two simple ways to ingest them.

Into Splunk

Using replay.py

Prerequisites: clone the repository, create a virtual environment, and install the Python dependencies:

git clone git@github.com:splunk/attack_data.git
cd attack_data
pip install virtualenv
virtualenv venv
source venv/bin/activate
pip install -r bin/requirements.txt

  1. Download the dataset.
  2. Configure bin/replay.yml.
  3. Run python bin/replay.py -c bin/replay.yml.

Using the UI

  1. Download the dataset.
  2. In Splunk Enterprise, go to Add Data -> Files & Directories and select the dataset.
  3. Set the sourcetype as specified in the YML file.
  4. Explore your data.

See a quick demo 📺 of this process here.

Into DSP

The simplest way to send datasets into DSP is with the scloud command-line tool, which must be installed first.

  1. Download the dataset
  2. Ingest the dataset into DSP with the scloud command: cat attack_data.json | scloud ingest post-events --format JSON
  3. Build a pipeline that reads from the firehose and you should see the events.

Contribute Datasets 🥰

  1. Generate a dataset
  2. Under the corresponding MITRE technique ID folder, create a folder named after the tool the dataset comes from, for example: atomic_red_team
  3. Open a PR that adds a <tool_name_yaml>.yml file to the newly created folder, and upload the dataset into the same folder.

See T1003.002 for a complete example.
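
The <tool_name_yaml>.yml file from step 3 can be drafted with a short script. The sketch below emits a skeleton with the same fields as the example definition shown earlier. The function name dataset_skeleton is hypothetical, and the hosted URL prefix (the master branch under media.githubusercontent.com) and the attack_range environment value are assumptions copied from that example; adjust them to match your dataset.

```python
import json
import uuid
from datetime import date
from typing import List, Optional

# Assumed URL prefix, copied from the dataset URLs in the example definition.
BASE_URL = "https://media.githubusercontent.com/media/splunk/attack_data/master/"


def _yaml_list(key: str, items: List[str]) -> List[str]:
    """Render a top-level list; JSON-quoted strings are valid YAML scalars."""
    return [f"{key}:"] + [f"- {json.dumps(item)}" for item in items]


def dataset_skeleton(author: str, description: str, technique: str, tool: str,
                     log_files: List[str], sourcetypes: List[str],
                     today: Optional[date] = None) -> str:
    """Return the text of a new dataset definition (hypothetical helper)."""
    folder = f"datasets/attack_techniques/{technique}/{tool}/"
    lines = [
        f"id: {uuid.uuid4()}",  # fresh random UUID for the new dataset
        f"author: {json.dumps(author)}",
        f"date: '{(today or date.today()).isoformat()}'",
        f"description: {json.dumps(description)}",
        "environment: attack_range",  # assumption: generated with attack_range
    ]
    lines += _yaml_list("technique", [technique])
    lines += _yaml_list("dataset", [BASE_URL + folder + name for name in log_files])
    # T1003.002 maps to https://attack.mitre.org/techniques/T1003/002/
    mitre_url = f"https://attack.mitre.org/techniques/{technique.replace('.', '/')}/"
    lines += _yaml_list("references", [mitre_url])
    lines += _yaml_list("sourcetypes", sourcetypes)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(dataset_skeleton(
        author="Jane Doe",  # placeholder author
        description="Describe the simulated attack in as much detail as possible.",
        technique="T1003.002",
        tool="atomic_red_team",
        log_files=["windows-sysmon.log"],
        sourcetypes=["XmlWinEventLog:Microsoft-Windows-Sysmon/Operational"],
    ))
```

The output is only a starting point: review every field, in particular the description and references, before opening the PR.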

Note: the simplest way to generate a dataset to contribute is to launch your simulations in the attack_range, or attack the machines manually, and when done dump the data using the dump function.

See a quick demo 📺 of the process to dump a dataset here.

To contribute a dataset, simply create a PR on this repository. For general instructions on creating a PR, see this guide.

Automatically generated Datasets ⚙️

This project uses automation to generate datasets with the attack_range. Details about this service are in the attack_data_service sub-project folder.

Author

License

Copyright 2023 Splunk Inc.

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.
