  • Stars: 287
  • Rank 144,232 (Top 3 %)
  • Language: Go
  • License: MIT License
  • Created over 4 years ago
  • Updated over 1 year ago

Repository Details

#OSINT tool for finding Github repositories by extracting commit logs in real time from the Github event API

Commit Stream

commit-stream drinks commit logs from the Github event firehose, exposing the author details (name and email address) associated with Github repositories in real time.

OSINT / Blueteam / Recon uses for Redteamers / Bug bounty hunters:

  • Uncover repositories to which employees of a target company are committing code (filter by email domain)
  • Identify repositories belonging to an individual (filter by author name)
  • Integration with Trufflehog, with alert reporting via Slack channels
  • Supports logging to Postgres, MySQL, SQLite and Elasticsearch databases

Companies have found the tool useful to discover repositories that their employees are committing intellectual property to.

Installation

Binaries

Compiled 64-bit executable files for Windows, Mac and Linux are available here

Docker

docker run x1sec/commit-stream

Building from source

If you would prefer to build it yourself (and Go is set up correctly):

go install github.com/x1sec/commit-stream@latest

Usage

./commit-stream | tee commits.txt

With no options specified, commit-stream writes its output to stdout in CSV format.

Usage:
  commit-stream [OPTIONS]

Options:
  -t, --token            Github token (if not specified, will use environment
                         variable 'CSTREAM_TOKEN' or from config.yaml)
  -e, --email-domain     Match email addresses field (specify multiple with comma)
                         Omit to match all.
  -n, --email-name       Match author name field (specify multiple with comma).
                         Omit to match all.
  -df --dom-file <file>  Match email domains specified in file
  -a  --all-commits      Search through previous commit history (default: false)
  -i  --ignore-priv      Ignore noreply.github.com private email addresses (default: false)
  -m  --messages         Fetch commit messages (default: false)
  -p  --public-events    Fetch on repositories made public (default: true) 
  -c  --config [path]    Use configuration file (optional)
  -d  --debug            Enable debug messages to stderr (default:false)
  -h  --help             This message
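
To make the default behaviour more concrete, the following is a minimal sketch of turning push events into CSV rows of author details. It is not commit-stream's actual code. It assumes the event JSON carries a `commits` array with `author.name` and `author.email` under `payload`, and the column order (name, email, repository) is an assumption rather than the tool's documented format. The `sample` data, `event` type and `authorRows` function are hypothetical names used only for illustration.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"os"
)

// event holds only the fields of a Github event that this sketch reads.
type event struct {
	Type string `json:"type"`
	Repo struct {
		Name string `json:"name"` // "user/repo"
	} `json:"repo"`
	Payload struct {
		Commits []struct {
			Author struct {
				Name  string `json:"name"`
				Email string `json:"email"`
			} `json:"author"`
		} `json:"commits"`
	} `json:"payload"`
}

// sample is a hand-written stand-in for an events API response.
const sample = `[
  {"type": "WatchEvent", "repo": {"name": "someone/else"}, "payload": {}},
  {"type": "PushEvent", "repo": {"name": "jane/demo"},
   "payload": {"commits": [
     {"author": {"name": "Jane Doe", "email": "jane@example.com"}}
   ]}}
]`

// authorRows decodes a JSON array of events and returns one
// [name, email, repo] row per commit found in PushEvents.
// The column order is an assumption made for this sketch.
func authorRows(data []byte) ([][]string, error) {
	var events []event
	if err := json.Unmarshal(data, &events); err != nil {
		return nil, err
	}
	var rows [][]string
	for _, ev := range events {
		if ev.Type != "PushEvent" {
			continue // only push events carry commit authors
		}
		for _, c := range ev.Payload.Commits {
			rows = append(rows, []string{c.Author.Name, c.Author.Email, ev.Repo.Name})
		}
	}
	return rows, nil
}

func main() {
	rows, err := authorRows([]byte(sample))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	w := csv.NewWriter(os.Stdout)
	if err := w.WriteAll(rows); err != nil { // WriteAll also flushes
		fmt.Fprintln(os.Stderr, err)
	}
}
```

Using `encoding/csv` rather than joining strings by hand keeps author names that contain commas or quotes correctly escaped.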

Tokens

commit-stream requires a Github personal access token. You can generate one in Github under [Settings / Developer Settings / Personal Access Tokens] by selecting 'Generate new token'. Nothing on that page needs to be selected; just enter a name for the token and click generate.

Once the token has been created, the recommended method is to set it via an environment variable CSTREAM_TOKEN:

export CSTREAM_TOKEN=xxxxxxxxxx

Alternatively, the --token switch may be used when invoking the program, e.g.:

./commit-stream --token xxxxxxxxxx

The token can also be specified in config.yaml:

github:
  token: ghp_xxxxx
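
The three token sources described above can be thought of as a simple fallback chain. The sketch below illustrates one way to resolve them. The order (command-line switch, then CSTREAM_TOKEN, then config.yaml) is an assumption: the help text does not state which of the environment variable and the configuration file wins. `resolveToken` is a hypothetical helper, not a function from commit-stream.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// resolveToken returns the first non-empty candidate together with a
// label naming where it came from. The precedence is assumed.
func resolveToken(flagValue, envValue, configValue string) (token, source string, err error) {
	switch {
	case flagValue != "":
		return flagValue, "--token", nil
	case envValue != "":
		return envValue, "CSTREAM_TOKEN", nil
	case configValue != "":
		return configValue, "config.yaml", nil
	}
	return "", "", errors.New("no Github token supplied")
}

func main() {
	// The flag and config values are hard-coded placeholders here; a real
	// program would take them from the command line and the parsed YAML.
	_, source, err := resolveToken("", os.Getenv("CSTREAM_TOKEN"), "ghp_xxxxx")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	fmt.Println("token taken from", source)
}
```

The program prints only the source label, never the token itself, so the secret does not end up in terminal scrollback or logs.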

Filtering

When running commit-stream with no options, it will immediately dump author details and the associated repositories in CSV format to the terminal. Filtering options are available.

To filter by email domain:

./commit-stream --email-domain 'company.com'

To filter by author name:

./commit-stream --email-name 'John Smith'

Multiple keywords can be specified by separating them with a comma (,), e.g.:

./commit-stream --email-domain 'tesla.com,ford.com'
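
As an illustration of the domain filter, here is a minimal sketch of splitting a comma-separated option value and matching an author email against it. It assumes a case-insensitive exact match on the part after the last `@`; how commit-stream itself compares domains (exact, suffix or substring) is not stated in this README. `parseList` and `matchDomain` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// parseList splits a comma-separated option value such as
// "company.com,ford.com" into trimmed, lower-cased entries.
func parseList(s string) []string {
	var out []string
	for _, part := range strings.Split(s, ",") {
		if p := strings.ToLower(strings.TrimSpace(part)); p != "" {
			out = append(out, p)
		}
	}
	return out
}

// matchDomain reports whether the domain of email equals one of domains.
// An empty list matches everything, mirroring "Omit to match all".
func matchDomain(email string, domains []string) bool {
	if len(domains) == 0 {
		return true
	}
	at := strings.LastIndex(email, "@")
	if at < 0 {
		return false // not an email address
	}
	host := strings.ToLower(email[at+1:])
	for _, d := range domains {
		if host == d {
			return true
		}
	}
	return false
}

func main() {
	domains := parseList("company.com,ford.com")
	for _, email := range []string{"bob@company.com", "eve@example.org"} {
		fmt.Println(email, matchDomain(email, domains))
	}
}
```

An exact match avoids false hits such as `notcompany.com`, at the cost of missing subdomains like `mail.company.com`; a suffix match on `"." + domain` would be the natural extension if those matter.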

To filter on a list of domain names specified in a text file, use -df, --dom-file:

./commit-stream --dom-file domainlist.txt

Email addresses that have been set to private (@users.noreply.github.com) can be omitted by specifying --ignore-priv. This is useful for reducing the volume of data collected when running the tool for an extended period of time.
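
A check of this kind can be sketched in a few lines. The example below assumes that a private address is recognised purely by its `@users.noreply.github.com` suffix; `isPrivate` and `dropPrivate` are hypothetical names, not functions from commit-stream.

```go
package main

import (
	"fmt"
	"strings"
)

// privateSuffix is the domain Github uses for private commit emails.
const privateSuffix = "@users.noreply.github.com"

// isPrivate reports whether email is a Github private (noreply) address.
func isPrivate(email string) bool {
	return strings.HasSuffix(strings.ToLower(email), privateSuffix)
}

// dropPrivate returns emails with all private addresses removed.
func dropPrivate(emails []string) []string {
	var kept []string
	for _, e := range emails {
		if !isPrivate(e) {
			kept = append(kept, e)
		}
	}
	return kept
}

func main() {
	emails := []string{
		"12345+jane@users.noreply.github.com",
		"jane@example.com",
	}
	fmt.Println(dropPrivate(emails))
}
```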

It is possible to search up to 20 previous commits for the filter keywords by specifying --all-commits. This may increase the likelihood of a positive match.

Output handlers

In config.yaml, the destination parameter is set to one of the following options:

  • stdout
  • database
  • elasticsearch
  • slack
  • script
  • truffle
  • truffle-slack

The appropriate configuration for the destination handler is required.

Standard out

The stdout handler is the default. It writes comma-separated values to stdout, which can be piped into a file. There are no other configuration options.

SQL Database

database handler writes events to a database with the database type specified by the engine parameter.

  • dsn must be specified for postgres and mysql
  • path must be specified for sqlite

database:
  # type is either: sqlite, mysql, postgres
  engine: postgres 
  
  # dsn required for mysql or postgres
  dsn: host=localhost user=postgres dbname=rob port=5432
  
  # path only required for sqlite
  path: ./test.db

Elastic Search

elasticsearch handler sends events to an elasticsearch database specified by the uri parameter:

elasticsearch:
  uri: http://127.0.0.1:9200
  no-duplicates: true

Note: no-duplicates is used to reduce the volume of data stored. Each document is treated as unique by its ID, which is a hash of the domain name and repository name (user/repo). Older documents with the same ID are updated with newer commits as they arrive.

Basic auth is optionally supported via the username and password parameters.
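
The no-duplicates behaviour can be illustrated with a deterministic document ID. The sketch below assumes SHA-256 over the lower-cased domain and the repository name; the actual hash function and input layout used by commit-stream are not stated here. `docID` is a hypothetical helper, and a Go map stands in for the Elasticsearch index.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// docID derives a stable ID from an email domain and a "user/repo" name.
// The newline separator keeps ("ab", "c") and ("a", "bc") distinct.
func docID(domain, repo string) string {
	sum := sha256.Sum256([]byte(strings.ToLower(domain) + "\n" + repo))
	return hex.EncodeToString(sum[:])
}

func main() {
	// index stands in for the Elasticsearch index: writing to an
	// existing ID replaces the older document instead of adding one.
	index := map[string]string{}
	index[docID("company.com", "jane/demo")] = "older commit"
	index[docID("company.com", "jane/demo")] = "newer commit"
	index[docID("company.com", "jane/other")] = "first commit"

	fmt.Println(len(index)) // prints 2
}
```

Because the ID depends only on the domain and repository, repeated commits to the same repository by authors at the same domain collapse into a single document.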

Slack

The slack handler requires both a Slack token and a channel ID to be defined:

slack:
  token: xoxb-0000-0000-0000
  channel-id: myChannel 

Note: To prevent accidental flooding to Slack, a domain/email filter must be specified.

Script

script handler executes a shell script specified by path. Two parameters are passed to the script: Github user and Github repository name.

script:
  path: ./script/run.sh
  log-file: ./script/script.log
  max-workers: 10

An example of ./script/run.sh:

#!/bin/bash
echo "Github user: $1"
echo "Github repo: $2"
echo "URL: https://github.com/${1}/${2}"

Note: max-workers is the number of instances invoked in parallel.

Trufflehog

The truffle handler requires the path of the trufflehog binary to be specified. A Github token is required, and it should be different from the one specified in the main commit-stream configuration.

truffle:
  path: ./script/trufflehog
  max-workers: 5

  github-token: ghp_AAAAAAAAA

  ignore:
    - Parseur

Note: Trufflehog signatures can be ignored by specifying them in the ignore list. This reduces the number of false positives.

Trufflehog with Slack notifications

The truffle-slack handler runs trufflehog to search for secrets and sends alerts to a Slack channel. Both the truffle and slack configurations must be defined.

Credits

Some inspiration was taken from ssshgit, @Darkport's excellent tool for extracting secrets from Github in real time. commit-stream's objective is slightly different, as it focuses on extracting the metadata as opposed to the content of the repositories.

Note

Github provides the ability to prevent email addresses from being exposed. In the Github settings select Keep my email addresses private and Block command line pushes that expose my email under the Email options.

As only one token is used, this software does not breach any terms of use with Github. That said, use at your own risk. The author does not hold any responsibility for its usage.
