  • Language
    Python
  • License
    MIT License
  • Created over 9 years ago
  • Updated 9 months ago

Repository Details

Threat Intelligence APIs

Threat Intelligence APIs.

Supported threat intelligence feeds

The package contains API wrappers for:

  • Umbrella Investigate API
  • VirusTotal API v2.0
  • ShadowServer API

Umbrella Investigate API

Umbrella Investigate provides an API that allows querying for:

  • Domain categorization
  • Security information about a domain
  • Co-occurrences for a domain
  • Related domains for a domain
  • Domains related to an IP
  • Domain tagging dates for a domain
  • DNS RR history for a domain
  • WHOIS information
    • WHOIS information for an email
    • WHOIS information for a nameserver
    • Historical WHOIS information for a domain
  • Latest malicious domains for an IP

To use the Investigate API wrapper, import the InvestigateApi class from the threat_intel.opendns module:

from threat_intel.opendns import InvestigateApi

To initialize the API wrapper you need the API key:

investigate = InvestigateApi("<INVESTIGATE-API-KEY-HERE>")

You can also specify the name of a JSON file in which the API responses will be cached, which saves bandwidth when you make multiple calls about the same domains or IPs:

investigate = InvestigateApi("<INVESTIGATE-API-KEY-HERE>", cache_file_name="/tmp/cache.opendns.json")
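The idea of a JSON-file response cache can be sketched roughly as follows. This is an illustration only, not the library's actual implementation; the JsonFileCache class and its lookup and store methods are hypothetical names introduced for this example.

```python
import json
import os
import tempfile


class JsonFileCache:
    """Hypothetical helper: a dict of responses persisted to a JSON file."""

    def __init__(self, cache_file_name):
        self._path = cache_file_name
        self._data = {}
        # Load earlier responses if the cache file already exists.
        if os.path.exists(self._path):
            with open(self._path) as f:
                self._data = json.load(f)

    def lookup(self, key):
        """Return the cached response for key, or None on a miss."""
        return self._data.get(key)

    def store(self, key, response):
        """Remember a response and write the whole cache back to disk."""
        self._data[key] = response
        with open(self._path, "w") as f:
            json.dump(self._data, f)


# A second instance reading the same file sees the stored response.
path = os.path.join(tempfile.mkdtemp(), "cache.example.json")
cache = JsonFileCache(path)
cache.store("google.com", {"status": 1})
reloaded = JsonFileCache(path)
```

With a cache like this, a repeated query about the same domain can be answered from disk instead of triggering another HTTP call.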

Domain categorization

Calls domains/categorization/?showLabels Investigate API endpoint. It takes a list (or any other Python enumerable) of domains and returns the categories that Umbrella associates with these domains, along with a status score of -1, 0, or 1, where -1 means malicious.

domains = ["google.com", "baidu.com", "bibikun.ru"]
investigate.categorization(domains)

will result in:

{
    "baidu.com": {"status": 1, "content_categories": ["Search Engines"], "security_categories": []},
    "google.com": {"status": 1, "content_categories": ["Search Engines"], "security_categories": []},
    "bibikun.ru": {"status": -1, "content_categories": [], "security_categories": ["Malware"]}
}
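A result of this shape is a plain dictionary, so it can be filtered with ordinary Python. The sketch below picks out the domains with a status of -1; malicious_domains is a hypothetical helper written for this example, and the sample data is copied from the output above.

```python
def malicious_domains(categorization_result):
    """Return the sorted domains whose status is -1 (malicious)."""
    return sorted(
        domain
        for domain, info in categorization_result.items()
        if info.get("status") == -1
    )


# Sample response copied from the categorization output above.
result = {
    "baidu.com": {"status": 1, "content_categories": ["Search Engines"], "security_categories": []},
    "google.com": {"status": 1, "content_categories": ["Search Engines"], "security_categories": []},
    "bibikun.ru": {"status": -1, "content_categories": [], "security_categories": ["Malware"]},
}

flagged = malicious_domains(result)
print(flagged)  # ['bibikun.ru']
```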

Security information about a domain

Calls security/name/ Investigate API endpoint. It takes any Python enumerable of domains, e.g. a list, and returns several security parameters associated with each domain.

domains = ["google.com", "baidu.com", "bibikun.ru"]
investigate.security(domains)

will result in:

{
  "baidu.com": {
    "found": true,
    "handlings": {
      "domaintagging": 0.00032008666962131285,
      "blocked": 0.00018876906157154347,
      "whitelisted": 0.00019697641207465407,
      "expired": 2.462205150933176e-05,
      "normal": 0.9992695458052232
    },
    "dga_score": 0,
    "rip_score": 0,

    ..

  }
}

Co-occurrences for a domain

Calls recommendations/name/ Investigate API endpoint. Use this method to find the co-occurring domains (domains that are accessed by the same users within a small window of time) for each domain given in a list, or any other Python enumerable.

domains = ["google.com", "baidu.com", "bibikun.ru"]
investigate.cooccurrences(domains)

will result in:

{
  "baidu.com": {
    "found": true,
    "pfs2": [
      ["www.howtoforge.de", 0.14108563836506008],
    ]

    ..

}
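Each "pfs2" entry in the output above is a [domain, score] pair, so the pairs can be ranked with a sort. The sketch below assumes that a larger second element means a stronger co-occurrence; top_cooccurrences is a hypothetical helper, and the second sample entry is made up for illustration.

```python
def top_cooccurrences(response, domain, limit=5):
    """Return up to `limit` (domain, score) pairs, highest score first."""
    pairs = response.get(domain, {}).get("pfs2", [])
    ranked = sorted(pairs, key=lambda pair: pair[1], reverse=True)
    return [(name, score) for name, score in ranked[:limit]]


response = {
    "baidu.com": {
        "found": True,
        "pfs2": [
            ["www.howtoforge.de", 0.14108563836506008],
            ["example.org", 0.5],  # made-up entry, for illustration only
        ],
    }
}

top = top_cooccurrences(response, "baidu.com", limit=1)
print(top)  # [('example.org', 0.5)]
```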

Related domains for a domain

Calls links/name/ Investigate API endpoint. Use this method to find the related domains (domains that have frequently been requested within a time window of about 60 seconds, but that are not associated with the given domain) for each domain given in a list, or any other Python enumerable.

domains = ["google.com", "baidu.com", "bibikun.ru"]
investigate.related_domains(domains)

will result in:

{
    "tb1": [
        ["t.co", 11.0],
        ]

    ..

}

Domain tagging dates for a domain

Calls domains/name/ Investigate API endpoint.

Use this method to get the date range during which the queried domain was part of the Umbrella block list, and how long the domain has been on that list.

domains = ["google.com", "baidu.com", "bibikun.ru"]
investigate.domain_tag(domains)

will result in:

{
    'category': u'Malware',
    'url': None,
    'period': {
        'begin': u'2013-09-16',
        'end': u'Current'
        }

    ..

}

DNS RR history for a domain

Calls dnsdb/name/a/ Investigate API endpoint. Use this method to find domains related to the domains given in a list, or any other Python enumerable.

domains = ["google.com", "baidu.com", "bibikun.ru"]
investigate.dns_rr(domains)

will result in:

{
    'features': {
        'geo_distance_mean': 0.0,
        'locations': [
            {
                'lat': 59.89440155029297,
                'lon': 30.26420021057129
            }
                    ],
        'rips': 1,
        'is_subdomain': False,
        'ttls_mean': 86400.0,
        'non_routable': False,
        }

    ..

}

DNS RR history for an IP

Calls dnsdb/ip/a/ Investigate API endpoint. Use this method to find domains related to the IP addresses given in a list, or any other Python enumerable.

ips = ['8.8.8.8']
investigate.rr_history(ips)

will result in:

{
  "8.8.8.8": {
    "rrs": [
      {
        "name": "8.8.8.8",
        "type": "A",
        "class": "IN",
        "rr": "000189.com.",
        "ttl": 3600
      },
      {
        "name": "8.8.8.8",
        "type": "A",
        "class": "IN",
        "rr": "008.no-ip.net.",
        "ttl": 60
      },
    ]

    ..

}
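The domain names in a result of this shape sit in the "rr" field of each record. The following sketch collects them into a sorted list and strips the trailing dot; domains_for_ip is a hypothetical helper, and the sample data is copied from the output above.

```python
def domains_for_ip(rr_history_result, ip):
    """Return the unique names from the "rr" fields, without trailing dots."""
    records = rr_history_result.get(ip, {}).get("rrs", [])
    names = {record["rr"].rstrip(".") for record in records}
    return sorted(names)


# Sample response copied from the rr_history output above.
result = {
    "8.8.8.8": {
        "rrs": [
            {"name": "8.8.8.8", "type": "A", "class": "IN", "rr": "000189.com.", "ttl": 3600},
            {"name": "8.8.8.8", "type": "A", "class": "IN", "rr": "008.no-ip.net.", "ttl": 60},
        ]
    }
}

names = domains_for_ip(result, "8.8.8.8")
print(names)  # ['000189.com', '008.no-ip.net']
```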

WHOIS information

WHOIS information for an email

Calls whois/emails/{email} Investigate API endpoint.

Use this method to see WHOIS information for the email address. The Umbrella API currently returns at most 500 results.

emails = ["<EMAIL-ADDRESS-HERE>"]
investigate.whois_emails(emails)

will result in:

{
    "<EMAIL-ADDRESS-HERE>": {
        "totalResults": 500,
        "moreDataAvailable": true,
        "limit": 500,
        "domains": [
            {
                "domain": "0emm.com",
                "current": true
            },
            ..
        ]
    }
}

WHOIS information for a nameserver

Calls whois/nameservers/{nameserver} Investigate API endpoint.

Use this method to see WHOIS information for the nameserver. The Umbrella API currently returns at most 500 results.

nameservers = ["ns2.google.com"]
investigate.whois_nameservers(nameservers)

will result in:

{
    "ns2.google.com": {
        "totalResults": 500,
        "moreDataAvailable": true,
        "limit": 500,
        "domains": [
            {
                "domain": "46645.biz",
                "current": true
            },
            ..
        ]
    }
}

WHOIS information for a domain

Calls whois/{domain} Investigate API endpoint.

Use this method to see WHOIS information for the domain.

domains = ["google.com"]
investigate.whois_domains(domains)

will result in:

{
    "administrativeContactFax": null,
    "whoisServers": null,
    "addresses": [
        "1600 amphitheatre parkway",
        "please contact [email protected], 1600 amphitheatre parkway",
        "2400 e. bayshore pkwy"
    ],
    ..
}

Historical WHOIS information for a domain

Calls whois/{domain}/history Investigate API endpoint.

Use this method to see historical WHOIS information for the domain.

domains = ["5esb.biz"]
investigate.whois_domains_history(domains)

will result in:

{
    '5esb.biz':[
        {
            u'registrantFaxExt':u'',
            u'administrativeContactPostalCode':u'656448',
            u'zoneContactCity':u'',
            u'addresses':[
                u'nan qu hua yuan xiao he'
            ],
            ..
        },
        ..
    ]
}

Latest malicious domains for an IP

Calls ips/{ip}/latest_domains Investigate API endpoint.

Use this method to see whether the IP address has any malicious domains associated with it.

ips = ["8.8.8.8"]
investigate.latest_malicious(ips)

will result in:

{
    [
        '7ltd.biz',
        'co0s.ru',
        't0link.in',
    ]

    ..
}

VirusTotal API

VirusTotal provides an API for querying reports about:

  • Domains
  • URLs
  • IPs
  • File hashes
  • File Upload
  • Live Feed
  • Advanced search

To use the VirusTotal API wrapper, import the VirusTotalApi class from the threat_intel.virustotal module:

from threat_intel.virustotal import VirusTotalApi

To initialize the API wrapper you need the API key:

vt = VirusTotalApi("<VIRUSTOTAL-API-KEY-HERE>")

VirusTotal API calls let you pack a list of file hashes or URLs into a single HTTP call. Depending on the API version you are using (public or private), you may need to tune the maximum number of resources (file hashes or URLs) that can be passed in a single API call. You can do so with the resources_per_req parameter:

vt = VirusTotalApi("<VIRUSTOTAL-API-KEY-HERE>", resources_per_req=4)

When using the public API, your standard request rate allows at most 4 resources per request. With the private API you can put up to 25 resources per call. That is also the default value if you don't pass the resources_per_req parameter.

Of course when calling the API wrapper methods in the VirusTotalApi class you can pass as many resources as you want and the wrapper will take care of producing as many API calls as necessary to satisfy the request rate.
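Splitting a long list of resources into batches of at most resources_per_req items can be sketched as follows. This only illustrates the batching idea and is not the wrapper's own code; chunked is a hypothetical helper and the hash strings are placeholders.

```python
def chunked(resources, resources_per_req):
    """Yield successive lists of at most `resources_per_req` items."""
    batch = []
    for resource in resources:
        batch.append(resource)
        if len(batch) == resources_per_req:
            yield batch
            batch = []
    # Emit the final, possibly shorter, batch.
    if batch:
        yield batch


# Ten placeholder resources split for the public API limit of 4 per request.
hashes = ["hash%d" % i for i in range(10)]
batches = list(chunked(hashes, 4))
print([len(batch) for batch in batches])  # [4, 4, 2]
```

Each batch would then correspond to one HTTP call, so ten resources with a limit of 4 need three calls.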

You can also specify the file name where the responses will be cached:

vt = VirusTotalApi("<VIRUSTOTAL-API-KEY-HERE>", cache_file_name="/tmp/cache.virustotal.json")

Domain report endpoint

Calls domain/report VirusTotal API endpoint. Pass a list or any other Python enumerable containing the domains:

domains = ["google.com", "baidu.com", "bibikun.ru"]
vt.get_domain_reports(domains)

will result in:

{
  "baidu.com": {
    "undetected_referrer_samples": [
      {
        "positives": 0,
        "total": 56,
        "sha256": "e3c1aea1352362e4b5c008e16b03810192d12a4f1cc71245f5a75e796c719c69"
      }
    ],

    ..

    }
}

URL report endpoint

Calls url/report VirusTotal API endpoint. Pass a list or any other Python enumerable containing the URL addresses:

urls = ["http://www.google.com", "http://www.yelp.com"]
vt.get_url_reports(urls)

will result in:

{
  "http://www.google.com": {
    "permalink": "https://www.virustotal.com/url/dd014af5ed6b38d9130e3f466f850e46d21b951199d53a18ef29ee9341614eaf/analysis/1423344006/",
    "resource": "http://www.google.com",
    "url": "http://www.google.com/",
    "response_code": 1,
    "scan_date": "2015-02-07 21:20:06",
    "scan_id": "dd014af5ed6b38d9130e3f466f850e46d21b951199d53a18ef29ee9341614eaf-1423344006",
    "verbose_msg": "Scan finished, scan information embedded in this object",
    "filescan_id": null,
    "positives": 0,
    "total": 62,
    "scans": {
      "CLEAN MX": {
        "detected": false,
        "result": "clean site"
      },
    }
  },

  ..

}

URL scan endpoint

Calls url/scan VirusTotal API endpoint. Submit a list or any other Python enumerable containing the URL addresses:

urls = ["http://www.google.com", "http://www.yelp.com"]
vt.get_url_reports(urls)

Hash report endpoint

Calls file/report VirusTotal API endpoint. You can request the file reports by passing a list of hashes (md5, sha1 or sha2):

file_hashes = [
    "99017f6eebbac24f351415dd410d522d",
    "88817f6eebbac24f351415dd410d522d"
]

vt.get_file_reports(file_hashes)

will result in:

{
  "88817f6eebbac24f351415dd410d522d": {
    "response_code": 0,
    "resource": "88817f6eebbac24f351415dd410d522d",
    "verbose_msg": "The requested resource is not among the finished, queued or pending scans"
  },
  "99017f6eebbac24f351415dd410d522d": {
    "scan_id": "52d3df0ed60c46f336c131bf2ca454f73bafdc4b04dfa2aea80746f5ba9e6d1c-1423261860",
    "sha1": "4d1740485713a2ab3a4f5822a01f645fe8387f92",
  }

 ..

}
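A result like this mixes hashes that VirusTotal has a report for with hashes it does not know. The sketch below separates the two by response_code. It assumes that a code of 1 marks a report that is present, as in the VirusTotal API v2.0 (the sample above only shows the 0 case); split_by_response_code is a hypothetical helper.

```python
def split_by_response_code(file_reports):
    """Separate hashes with a report (code 1) from all other hashes."""
    known, unknown = [], []
    for file_hash, report in file_reports.items():
        if report.get("response_code") == 1:
            known.append(file_hash)
        else:
            unknown.append(file_hash)
    return sorted(known), sorted(unknown)


reports = {
    "88817f6eebbac24f351415dd410d522d": {
        "response_code": 0,
        "resource": "88817f6eebbac24f351415dd410d522d",
    },
    "99017f6eebbac24f351415dd410d522d": {
        # response_code added for illustration; the sample above elides it.
        "response_code": 1,
        "sha1": "4d1740485713a2ab3a4f5822a01f645fe8387f92",
    },
}

known, unknown = split_by_response_code(reports)
```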

Hash rescan endpoint

Calls file/rescan VirusTotal API endpoint. Use it to rescan a previously submitted file. You can request the rescan by passing a list of hashes (md5, sha1 or sha2).

Hash behaviour endpoint

Calls file/behaviour VirusTotal API endpoint. Use it to get a report about the behaviour of the file when executed in a sandboxed environment (Cuckoo sandbox). You can request the reports by passing a list of hashes (md5, sha1 or sha2):

file_hashes = [
    "99017f6eebbac24f351415dd410d522d",
    "88817f6eebbac24f351415dd410d522d"
]

vt.get_file_behaviour(file_hashes)

Hash network-traffic endpoint

Calls file/network-traffic VirusTotal API endpoint. Use it to get a dump of the network traffic generated by the file when executed. You can request the dumps by passing a list of hashes (md5, sha1 or sha2):

file_hashes = [
    "99017f6eebbac24f351415dd410d522d",
    "88817f6eebbac24f351415dd410d522d"
]

vt.get_file_network_traffic(file_hashes)

Hash download endpoint

Calls file/download VirusTotal API endpoint. Use it to download a file by its hash. You can request the files by passing a list of hashes (md5, sha1 or sha2):

file_hashes = [
    "99017f6eebbac24f351415dd410d522d",
    "88817f6eebbac24f351415dd410d522d"
]

vt.get_file_download(file_hashes)

IP reports endpoint

Calls ip-address/report VirusTotal API endpoint. Pass a list or any other Python enumerable containing the IP addresses:

ips = ['90.156.201.27', '198.51.132.80']
vt.get_ip_reports(ips)

will result in:

{
  "90.156.201.27": {
    "asn": "25532",
    "country": "RU",
    "response_code": 1,
    "as_owner": ".masterhost autonomous system",
    "verbose_msg": "IP address found in dataset",
    "resolutions": [
      {
        "last_resolved": "2013-04-01 00:00:00",
        "hostname": "027.ru"
      },
      {
        "last_resolved": "2015-01-20 00:00:00",
        "hostname": "600volt.ru"
      },

      ..

    ],
    "detected_urls": [
      {
        "url": "http://shop.albione.ru/",
        "positives": 2,
        "total": 52,
        "scan_date": "2014-04-06 11:18:17"
      },
      {
        "url": "http://www.orlov.ru/",
        "positives": 3,
        "total": 52,
        "scan_date": "2014-03-05 09:13:31"
      }
    ],
  },

  "198.51.132.80": {

    ..

  }
}
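The "detected_urls" list in a report like this can be filtered by the number of positive detections. The sketch below works on the per-IP report dictionary; flagged_urls and its min_positives parameter are hypothetical names, and the sample entries are copied from the output above.

```python
def flagged_urls(ip_report, min_positives=1):
    """Return (url, positives, total) for URLs with enough detections."""
    return [
        (entry["url"], entry["positives"], entry["total"])
        for entry in ip_report.get("detected_urls", [])
        if entry["positives"] >= min_positives
    ]


# The "detected_urls" part of the report for 90.156.201.27 shown above.
report = {
    "detected_urls": [
        {"url": "http://shop.albione.ru/", "positives": 2, "total": 52,
         "scan_date": "2014-04-06 11:18:17"},
        {"url": "http://www.orlov.ru/", "positives": 3, "total": 52,
         "scan_date": "2014-03-05 09:13:31"},
    ]
}

strong = flagged_urls(report, min_positives=3)
print(strong)  # [('http://www.orlov.ru/', 3, 52)]
```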

URL live feed endpoint

Calls url/distribution VirusTotal API endpoint. Use it to get a live feed of the latest URLs submitted to VirusTotal.

vt.get_url_distribution()

Hash live feed endpoint

Calls file/distribution VirusTotal API endpoint. Use it to get a live feed of the latest hashes submitted to VirusTotal.

vt.get_file_distribution()

Hash search endpoint

Calls file/search VirusTotal API endpoint. Use it to search for samples that match some binary/metadata/detection criteria.

vt.get_file_search()

File date endpoint

Calls file/clusters VirusTotal API endpoint. Use it to list similarity clusters for a given time frame.

vt.get_file_clusters()

ShadowServer API

ShadowServer provides an API for testing hashes against a list of known software applications.

To use the ShadowServer API wrapper, import the ShadowServerApi class from the threat_intel.shadowserver module:

from threat_intel.shadowserver import ShadowServerApi

To use the API wrapper simply call the ShadowServerApi initializer:

ss = ShadowServerApi()

You can also specify the file name where the API responses will be cached:

ss = ShadowServerApi(cache_file_name="/tmp/cache.shadowserver.json")

To check whether the hashes are on the ShadowServer list of known hashes, call the get_bin_test method and pass an enumerable with the hashes you want to test:

file_hashes = [
    "99017f6eebbac24f351415dd410d522d",
    "88817f6eebbac24f351415dd410d522d"
]

ss.get_bin_test(file_hashes)

Installation

Install with pip:

$ pip install threat_intel

Testing

Go to town with make:

$ sudo pip install tox
$ make test
