  • Stars: 177
  • Rank: 208,344 (Top 5%)
  • Language: Ruby
  • License: MIT License
  • Created almost 13 years ago
  • Updated about 1 year ago


Repository Details

Voight-Kampff is a Ruby gem that detects bots, spiders, crawlers and replicants

Voight-Kampff


Voight-Kampff relies on a user agent list for its detection. It can easily tell you if a request is coming from a crawler, spider or bot. This can be especially helpful in analytics such as page hit tracking.

Installation

gem install voight_kampff

If you're using Rails and want to add ActionDispatch::Request#bot? and ActionDispatch::Request#human? methods, require voight_kampff/rails:

gem 'voight_kampff', require: 'voight_kampff/rails'

If you're using pure Rack, require it the following way:

gem 'voight_kampff', require: 'voight_kampff/rack'

Configuration

A JSON file is used to match user agent strings to a list of known bots.

If you'd like to use an updated list or make your own customizations, run rake voight_kampff:import_user_agents. This will download a crawler-user-agents.json file into the ./config directory.

Note: The pattern entries in the JSON file are evaluated as regular expressions.
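To illustrate what "evaluated as regular expressions" means in practice, here is a minimal sketch that compiles the pattern entries of a tiny sample list and tests a user agent against them. The two sample entries, the PATTERNS constant and the sample_bot? helper are assumptions made for this example. This is not the gem's actual implementation, and the real crawler-user-agents.json is much longer.

```ruby
require 'json'

# Illustrative sample only (an assumption for this sketch), shaped like a
# list of entries that each carry a "pattern" key.
SAMPLE_JSON = <<~'JSON'
  [
    { "pattern": "Googlebot\\/" },
    { "pattern": "bingbot" }
  ]
JSON

# Each pattern string is compiled into a Regexp, so characters such as
# "." or "(" in a custom entry have their regex meaning unless escaped.
PATTERNS = JSON.parse(SAMPLE_JSON).map { |entry| Regexp.new(entry['pattern']) }

# Hypothetical helper: true when any pattern matches the user agent string.
def sample_bot?(user_agent)
  PATTERNS.any? { |pattern| pattern.match?(user_agent.to_s) }
end

puts sample_bot?('Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)')
puts sample_bot?('Mozilla/5.0 (X11; Linux x86_64) Firefox/120.0')
```

When adding your own entries, remember to escape literal characters that are special in regular expressions (for example, write a literal dot as \\. inside the JSON string).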

Usage

There are three ways to use Voight-Kampff:

  1. Through Rack::Request such as in your Ruby on Rails controllers:
    request.bot?

  2. Through the VoightKampff module:
    VoightKampff.bot? 'your user agent string'

  3. Through a VoightKampff::Test instance:
    VoightKampff::Test.new('your user agent string').bot?

Each of the above responds to both human? and bot?. Both methods return true or false.
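As a rough sketch of the page hit tracking use case mentioned earlier, the example below counts hits per path and skips any request flagged as a bot. The HitCounter class and the toy check are hypothetical and exist only for illustration. The check is passed in as a callable so the sketch runs without the gem installed.

```ruby
# Hypothetical hit counter: counts page hits per path, ignoring bots.
class HitCounter
  attr_reader :counts

  # bot_check: any callable that takes a user agent string
  # and returns true or false.
  def initialize(bot_check)
    @bot_check = bot_check
    @counts = Hash.new(0)
  end

  # Records a hit unless the user agent is flagged as a bot.
  # Returns true when the hit was counted, false otherwise.
  def record(path, user_agent)
    return false if @bot_check.call(user_agent)

    @counts[path] += 1
    true
  end
end

# Toy stand-in (an assumption) so the sketch is self-contained.
# In an application this could be: ->(ua) { VoightKampff.bot?(ua) }
toy_check = ->(ua) { ua.to_s.match?(/bot/i) }

counter = HitCounter.new(toy_check)
counter.record('/articles/1', 'Mozilla/5.0 (X11; Linux x86_64)')
counter.record('/articles/1', 'ExampleBot/1.0')
puts counter.counts['/articles/1']
```

Injecting the check keeps the counter easy to test with a stub, while the application can hand it the real detection method.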

Upgrading to version 1.0

Version 1.0 uses a new source for its list of bot user agent strings, since the old source was no longer maintained. This new source, unfortunately, does not include as much detail. Therefore the following methods have been deprecated:

  • #browser?
  • #checker?
  • #downloader?
  • #proxy?
  • #crawler?
  • #spam?

In general the #bot? method tends to cover all of these, and I doubt many people were getting this granular with their bot checking. So I see it as a small price to pay for an open and up-to-date bot list.

Also, the gem no longer extends ActionDispatch::Request. Instead it extends Rack::Request, which ActionDispatch::Request inherits from. This allows the same functionality for Rails while opening the gem up to other Rack-based projects.
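The inheritance argument can be sketched with stand-in classes: methods added to a base class become available on every subclass. BaseRequest, FrameworkRequest and BotMethods are hypothetical names for this illustration, and the matching rule is a toy placeholder rather than the gem's detection logic.

```ruby
# Stand-in for Rack::Request (hypothetical and heavily simplified).
class BaseRequest
  attr_reader :user_agent

  def initialize(user_agent)
    @user_agent = user_agent
  end
end

# Stand-in for ActionDispatch::Request, which inherits from Rack::Request.
class FrameworkRequest < BaseRequest; end

# Hypothetical mixin holding the detection methods.
module BotMethods
  # Toy rule for illustration only.
  def bot?
    user_agent.to_s.match?(/bot|crawler|spider/i)
  end

  def human?
    !bot?
  end
end

# Extending only the base class is enough: the subclass picks the
# methods up through inheritance.
BaseRequest.include(BotMethods)

puts FrameworkRequest.new('ExampleBot/1.0').bot?
puts FrameworkRequest.new('Mozilla/5.0 (X11; Linux x86_64)').human?
```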

Upgrading to version 2.0

If you use Rails and the ActionDispatch::Request#bot? and ActionDispatch::Request#human? methods, change your Gemfile:

-gem 'voight_kampff'
+gem 'voight_kampff', require: 'voight_kampff/rails'

If you use Rack, change your Gemfile:

-gem 'voight_kampff'
+gem 'voight_kampff', require: 'voight_kampff/rack'

FAQ

Q: What's with the name?
A: It's the machine in Blade Runner that is used to test whether someone is a human or a replicant.

Q: I've found a bot that isn't being matched
A: The list is being pulled from github.com/monperrus/crawler-user-agents. If you'd like to have entries added to the list, please create a pull request with that project. Once that pull request is merged, feel free to create an issue here and I'll release a new gem version with the updated list. In the meantime you can always run rake voight_kampff:import_user_agents on your project to get that updated list.

Q: Why don't you use the user agent list from ______________?
A: If you know of a better source for a list of bot user agent strings, please create an issue and let me know. I'm open to switching to a better source or supporting multiple sources. There are others out there but I like the openness of monperrus' list.

Thanks

Thanks to github.com/monperrus/crawler-user-agents for providing an open and easily updatable list of bot user agents.

Contributing

PRs without tests will not get merged. Make sure you write tests for the API and the Rails app. Feel free to ask for help if you do not know how to write a particular test.

Running Tests

  • bundle install
  • bundle exec rspec
