  • Stars: 164
  • Rank: 225,314 (Top 5 %)
  • Language: Python
  • License: BSD 3-Clause "New...
  • Created over 6 years ago
  • Updated 10 months ago

Repository Details

An open standard for hashing network flows into identifiers, a.k.a. "Community IDs".

Community ID Flow Hashing

When processing flow data from a variety of monitoring applications (such as Zeek and Suricata), it's often desirable to pivot quickly from one dataset to another. While the required flow tuple information is usually present in the datasets, the details of such "joins" can be tedious, particularly in corner cases. This spec describes "Community ID" flow hashing, standardizing the production of a string identifier representing a given network flow, to reduce the pivot to a simple string comparison.

Pseudo code

function community_id_v1(ipaddr saddr, ipaddr daddr, port sport, port dport, int proto, int seed=0)
{
    # Get seed and all tuple parts into network byte order
    seed = pack_to_nbo(seed); # 2 bytes
    saddr = pack_to_nbo(saddr); # 4 or 16 bytes
    daddr = pack_to_nbo(daddr); # 4 or 16 bytes
    sport = pack_to_nbo(sport); # 2 bytes
    dport = pack_to_nbo(dport); # 2 bytes

    # Abstract away directionality: flip the endpoints as needed
    # so the smaller IP:port tuple comes first.
    saddr, daddr, sport, dport = order_endpoints(saddr, daddr, sport, dport);

    # Produce 20-byte SHA1 digest. "." means concatenation. The
    # proto value is one byte in length and followed by a 0 byte
    # for padding.
    sha1_digest = sha1(seed . saddr . daddr . proto . 0 . sport . dport);

    # Prepend version string to base64 rendering of the digest.
    # v1 is currently the only one available.
    return "1:" + base64(sha1_digest)
}

function community_id_icmp(ipaddr saddr, ipaddr daddr, int type, int code, int seed=0)
{
    port sport, dport;

    # ICMP / ICMPv6 endpoint mapping directly inspired by Zeek
    sport, dport = map_icmp_to_ports(type, code);

    # ICMP is IP protocol 1, ICMPv6 would be 58
    return community_id_v1(saddr, daddr, sport, dport, 1, seed); 
}
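
As a rough illustration, the pseudo code above could be translated to Python along the following lines. This is a minimal sketch and not the reference implementation: it assumes a port-based protocol such as TCP or UDP, leaves out the ICMP mapping, assumes the seed fits in 16 bits, and uses only the standard library.

```python
import base64
import hashlib
import ipaddress
import struct


def community_id_v1(saddr, daddr, sport, dport, proto, seed=0):
    """Return a version 1 Community ID string for a port-based flow."""
    # Pack addresses into network byte order: 4 bytes (IPv4) or 16 bytes (IPv6).
    src = ipaddress.ip_address(saddr).packed
    dst = ipaddress.ip_address(daddr).packed

    # Abstract away directionality: put the smaller (address, port) pair first.
    # Equal-length big-endian byte strings compare like the numbers they encode.
    if (src, sport) > (dst, dport):
        src, dst = dst, src
        sport, dport = dport, sport

    data = b"".join([
        struct.pack("!H", seed),           # 2-byte seed
        src,
        dst,
        struct.pack("!BB", proto, 0),      # 1-byte protocol, then 1 zero byte of padding
        struct.pack("!HH", sport, dport),  # 2-byte ports
    ])

    # 20-byte SHA1 digest, base64-encoded and prefixed with the version.
    digest = hashlib.sha1(data).digest()
    return "1:" + base64.b64encode(digest).decode("ascii")


# Both directions of the same TCP flow (IP protocol 6) yield the same ID.
forward = community_id_v1("10.0.0.1", "127.0.0.1", 1234, 80, 6)
reverse = community_id_v1("127.0.0.1", "10.0.0.1", 80, 1234, 6)
print(forward == reverse)  # True
```

Comparing the packed address bytes together with the integer port is one way to implement order_endpoints(); other implementations may compare the packed ports instead, which gives the same ordering.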

Technical details

  • The Community ID is an additional flow identifier and doesn't need to replace existing flow identification mechanisms already supported by the monitors. It's okay, however, for a monitor to be configured to log only the Community ID, if desirable.

  • The Community ID can be computed as a monitor produces flows, or can also be added to existing flow records at a later stage assuming that said records convey all the needed flow endpoint information.

  • Collisions in the Community ID, while undesirable, are not considered fatal, since the user should still possess flow timing information and possibly the monitor's native ID mechanism (hopefully stronger than the Community ID) for disambiguation.

  • The hashing mechanism uses seeding to enable additional control over "domains" of Community ID usage. The seed defaults to 0, so the mechanism stays out of the way of operators who are not interested in it.

  • In version 1 of the ID, the hash algorithm is SHA1. Future hash versions may switch it or allow additional configuration.

  • The binary 20-byte SHA1 result gets base64-encoded to reduce output volume compared to the usual ASCII-based SHA1 representation. This assumes that space, not computation time, is the primary concern, and may become configurable in a later version.

  • The resulting flow ID includes a version number to make the underlying Community ID implementation explicit. This allows users to ensure they're comparing apples to apples while supporting future changes to the algorithm. For example, when one monitor's version of the ID incorporates VLAN IDs but another's does not, hash value comparisons should reliably fail. A more complex form of this feature could allow capturing configuration settings in addition to the implementation version.

    The versioning scheme currently simply prefixes the hash value with the version number and a colon ("<version>:"), yielding something like this in the current version 1:

    1:hO+sN4H+MG5MY/8hIrXPqc4ZQz0=

  • The hash input is aligned on 32-bit boundaries. Flow tuple components use network byte order (big-endian) to standardize ordering regardless of host hardware.

  • The hash input is ordered to remove directionality in the flow tuple: swap the endpoints, if needed, so the numerically smaller IP:port tuple comes first. If the IP addresses are equal, the ports decide. For example, the following netflow 5-tuples create identical Community ID hashes because they both get ordered into the sequence 10.0.0.1, 127.0.0.1, 1234, 80.

    • Proto: TCP; SRC IP: 10.0.0.1; DST IP: 127.0.0.1; SRC Port: 1234; DST Port: 80
    • Proto: TCP; SRC IP: 127.0.0.1; DST IP: 10.0.0.1; SRC Port: 80; DST Port: 1234

  • This version includes the following protocols and fields:

    The above does not currently cover how to handle nesting (IP in IP, v6 over v4, etc) as well as encapsulations such as VLAN and MPLS.

  • If a network monitor doesn't support any of the above protocol constellations, it can safely report an empty string (or another non-colliding value) for the flow ID.

  • Consider v1 a prototype. Feedback from the community, particularly implementers and operational users of the ID, is greatly appreciated. Please create issues directly in the GitHub project at https://github.com/corelight/community-id-spec, or contact Christian Kreibich.

  • Many thanks for helpful discussion and feedback to Victor Julien, Johanna Amann, and Robin Sommer, and to all implementers and supporters.
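
The byte layout, endpoint ordering, and encoding choices described in the list above can be checked with a short Python sketch. The helper name hash_input() is hypothetical and chosen only for this illustration. The code assumes a port-based protocol and a 16-bit seed, and it is not taken from the reference implementation.

```python
import base64
import hashlib
import ipaddress
import struct


def hash_input(saddr, daddr, sport, dport, proto, seed=0):
    """Build the byte string fed into SHA1 (hypothetical helper for illustration)."""
    a = (ipaddress.ip_address(saddr).packed, sport)
    b = (ipaddress.ip_address(daddr).packed, dport)
    # Numerically smaller IP:port endpoint first; ports decide if the IPs are equal.
    lo, hi = sorted([a, b])
    # "!" selects network byte order; "x" emits the single zero padding byte.
    return (struct.pack("!H", seed) + lo[0] + hi[0]
            + struct.pack("!BxHH", proto, lo[1], hi[1]))


# The two 5-tuples from the example above produce identical hash input.
fwd = hash_input("10.0.0.1", "127.0.0.1", 1234, 80, 6)
rev = hash_input("127.0.0.1", "10.0.0.1", 80, 1234, 6)
print(fwd == rev)   # True
print(fwd.hex())    # 00000a0000017f000001060004d20050

# IPv4 input is 16 bytes and IPv6 input is 40 bytes: both are multiples of 4.
v6 = hash_input("::1", "fe80::1", 1234, 80, 6)
print(len(fwd), len(v6))  # 16 40

# Base64 renders the 20-byte digest in 28 characters, hex needs 40.
digest = hashlib.sha1(fwd).digest()
print(len(base64.b64encode(digest)), len(digest.hex()))  # 28 40
```

Printing the hex form of the hash input, as done here, is also a convenient way to compare two implementations byte by byte when their IDs disagree.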

Reference implementation

A complete implementation is available in the pycommunityid package. It includes a range of tests to verify correct computation for the various protocols. We recommend it to guide new implementations.

A smaller implementation is also available via the community-id.py script in this repository, including the byte layout of the hashed values (see packet_get_comm_id()). See --help and make.sh to get started:

  $ ./community-id.py --help
  usage: community-id.py [-h] [--seed NUM] PCAP [PCAP ...]

  Community flow ID reference

  positional arguments:
    PCAP         PCAP packet capture files

  optional arguments:
    -h, --help   show this help message and exit
    --seed NUM   Seed value for hash operations
    --no-base64  Don't base64-encode the SHA1 binary value
    --verbose    Show verbose output on stderr

For troubleshooting, the implementation supports omitting the base64 operation, and can provide additional detail about the exact sequence of bytes going into the SHA1 hash computation.

Reference data

The baseline directory in this repo contains datasets to help you verify that your implementation of Community ID functions correctly.

Reusable modules/libraries

Sought-after implementations (please get in touch if you're considering writing one of these!):

  • JavaScript

Production implementations

Feature requests in other projects

Talks

Blog posts and other resources

Discussion

Feel free to discuss aspects of the Community ID via GitHub here: https://github.com/corelight/community-id-spec/issues
