  • Stars: 136
  • Rank: 258,317 (Top 6 %)
  • Language: Ruby
  • License: MIT License
  • Created about 16 years ago
  • Updated about 2 years ago

Repository Details

Ruby Client API for Sphinx

Riddle

Riddle is a Ruby library interfacing with the Sphinx full-text search tool. It is written by Pat Allan, and has been influenced by both Dmytro Shteflyuk's Ruby client and the original PHP client. It can be used for interactions with Sphinx's command-line tools searchd and indexer, sending search queries via the binary protocol, and programmatically generating Sphinx configuration files.

The syntax here, while closer to a usual Ruby approach than the PHP client, is quite old (Riddle was first published in 2007). While it would be nice to re-work things, it's really not a priority, given the bulk of Riddle's code is for Sphinx's deprecated binary protocol.

Installation

Riddle is available as a gem, so you can install it directly:

gem install riddle

Or include it in a Gemfile:

gem 'riddle', '~> 2.4'

Usage

As of version 1.0.0, Riddle supports multiple versions of Sphinx in the one gem - you'll need to require your specific version after a normal require, though. The latest distinct version is 2.1.0:

require 'riddle'
require 'riddle/2.1.0'

The full list of available versions is 0.9.8 (the initial base), 0.9.9, 1.10, 2.0.1, and 2.1.0. If you're using something more modern than 2.1.0, then just require 2.1.0, and the rest should be fine (changes to the binary protocol since then are minimal).

Configuration

Riddle's structure for generating Sphinx configuration is a very direct mapping to Sphinx's configuration options. First, create an instance of Riddle::Configuration:

config = Riddle::Configuration.new

This configuration instance has methods indexer, searchd and common, which return separate inner-configuration objects with methods mapping to the equivalent Sphinx settings. So, you may want to do the following:

config.indexer.mem_limit = '128M'
config.searchd.log       = '/my/log/file.log'

Similarly, there are two further methods indices and sources, which are arrays meant to hold instances of index and source inner-configuration objects respectively (all of which have methods matching their Sphinx settings). The available index classes are:

  • Riddle::Configuration::DistributedIndex
  • Riddle::Configuration::Index
  • Riddle::Configuration::RealtimeIndex
  • Riddle::Configuration::RemoteIndex
  • Riddle::Configuration::TemplateIndex

All of these index classes should be initialised with their name, and in the case of plain indices, their source objects. Remote indices take an address, port and name as their initialiser parameters.

index = Riddle::Configuration::Index.new 'articles', article_source_a, article_source_b
index.path    = '/path/to/index/files'
index.docinfo = 'external'

The available source classes are:

  • Riddle::Configuration::SQLSource
  • Riddle::Configuration::TSVSource
  • Riddle::Configuration::XMLSource

The initialising parameters are the name of the source, and the type of source:

source = Riddle::Configuration::SQLSource.new 'article_source', 'mysql'
source.sql_query = "SELECT id, title, body FROM articles"
source.sql_host  = "127.0.0.1"

Once you have created your configuration object tree, you can then generate the string representation and perhaps save it to a file:

File.write "sphinx.conf", config.render

It's also possible to parse an existing Sphinx configuration file into a configuration object tree:

configuration = Riddle::Configuration.parse! File.read('sphinx.conf')

Indexing and Starting/Stopping the Daemon

Using Sphinx's command-line tools indexer and searchd via Riddle is all done via an instance of Riddle::Controller:

configuration_file = "/path/to/sphinx.conf"
configuration      = Riddle::Configuration.parse! File.read(configuration_file)
controller         = Riddle::Controller.new configuration, configuration_file

# set the path where the indexer and searchd binaries are located:
controller.bin_path = '/usr/local/bin'

# set different binary names if you're running a custom Sphinx installation:
controller.searchd_binary_name = 'sphinxsearchd'
controller.indexer_binary_name = 'sphinxindexer'

# process all indices:
controller.index
# process specific indices:
controller.index 'articles', 'books'
# rotate old index files out for the new ones:
controller.rotate

# start the daemon:
controller.start
# start the daemon and do not detach the process:
controller.start :nodetach => true
# stop the daemon:
controller.stop

The index, start and stop methods all accept a hash of options, and the :verbose option is respected in each case.

Each of these methods will return an instance of Riddle::CommandResult - or, if the command fails (as judged by the process status code), a Riddle::CommandFailedError exception is raised. These exceptions respond to the command_result method with the corresponding details.
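The success and failure paths described above could be handled along these lines. This is only a sketch: index_with_report is a hypothetical helper name, and it assumes that the options hash can follow the index names in the same call.

```ruby
# Assumes the riddle gem has already been loaded (require "riddle") by the
# caller, and that `controller` is a Riddle::Controller set up as shown above.
# index_with_report is a hypothetical helper, not part of Riddle.
def index_with_report(controller, *index_names)
  # Assumption: the options hash is passed after the index names.
  result = controller.index(*index_names, :verbose => true)
  puts "indexing succeeded"
  result
rescue Riddle::CommandFailedError => error
  # The exception carries the details of the failed command.
  warn "indexing failed: #{error.message}"
  error.command_result
end
```

Either way the caller gets back a command result, so logging or retry logic can live in one place.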

SphinxQL Queries

Riddle does not have any code to send SphinxQL queries and commands to Sphinx. Because Sphinx uses the mysql41 protocol (thus mimicking a MySQL database server), I recommend using the mysql2 gem instead. The connection code in Thinking Sphinx may provide some inspiration on this.
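As a rough illustration of the mysql2 approach, a query helper might look like the sketch below. The helper name sphinxql_search, the index name, and port 9306 are assumptions; the port depends on the mysql41 listener in your own searchd configuration.

```ruby
# Hypothetical helper: runs a full-text MATCH query over SphinxQL.
# `client` is expected to behave like Mysql2::Client (responding to #escape
# and #query). In an application it could be created with something like:
#   require "mysql2"
#   client = Mysql2::Client.new(:host => "127.0.0.1", :port => 9306)
# index_name is interpolated as-is, so it must come from trusted code.
def sphinxql_search(client, index_name, terms, limit = 20)
  # Escape the search terms so quotes cannot break out of the string literal.
  escaped = client.escape(terms)
  sql = "SELECT * FROM #{index_name} WHERE MATCH('#{escaped}') LIMIT #{Integer(limit)}"
  client.query(sql).to_a
end
```

Passing the client in, rather than creating it inside the helper, keeps connection handling (pooling, reconnects) in the hands of the calling code.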

Binary Protocol Searching

Sphinx's legacy binary protocol does not have many of the more recent Sphinx features - such as real-time indices - as these are only available in the SphinxQL/mysql41 protocol. However, Riddle can still be used for the binary protocol if you wish.

To get started, just instantiate a Client object:

client = Riddle::Client.new # defaults to localhost and port 9312
client = Riddle::Client.new "sphinxserver.domain.tld", 3333 # custom settings

And then set the parameters to what you want, before running a query:

client.match_mode = :extended
client.query "Pat Allan @state Victoria"

The results from a query are similar to the other clients, but here are the details. It's a hash with the following keys:

  • :matches
  • :fields
  • :attributes
  • :attribute_names
  • :words
  • :total
  • :total_found
  • :time
  • :status
  • :warning (if appropriate)
  • :error (if appropriate)

The key :matches returns an array of hashes - the actual search results. Each hash has the document id (:doc), the result weighting (:weight), and a hash of the attributes for the document (:attributes).

The :fields and :attribute_names keys return lists of the fields and attributes for the documents. The key :attributes will return a hash of attribute name and type pairs, and :words returns a hash of hashes representing the words from the search, with the number of documents and hits for each, along the lines of:

results[:words]["Pat"] #=> {:docs => 12, :hits => 15}

:total, :total_found and :time return the number of matches available, the total number of matches (which may be greater than the maximum available), and the time in milliseconds that the query took to run.

:status is the error code for the query - and if there was a related warning, it will be under the :warning key. Fatal errors will be described under :error.
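To make the shape of the result hash more concrete, here is a small sketch that walks a hand-built sample. The values are made up, the sample only includes the keys it uses, and the string attribute key "state" and the helper name summarise are assumptions for illustration.

```ruby
# Hand-built sample shaped like the hash described above (values are made up).
results = {
  :matches => [
    {:doc => 11, :weight => 2, :attributes => {"state" => "Victoria"}},
    {:doc => 42, :weight => 1, :attributes => {"state" => "Victoria"}}
  ],
  :words       => {"Pat" => {:docs => 12, :hits => 15}},
  :total       => 2,
  :total_found => 2,
  :time        => 3,
  :status      => 0
}

# Hypothetical helper: turns a result hash into printable summary lines.
def summarise(results)
  # Fatal errors and warnings only appear when relevant.
  raise results[:error] if results[:error]
  warn "Sphinx warning: #{results[:warning]}" if results[:warning]

  lines = results[:matches].map do |match|
    "doc #{match[:doc]} weight=#{match[:weight]} state=#{match[:attributes]['state']}"
  end
  results[:words].each do |word, stats|
    lines << "#{word}: #{stats[:docs]} docs, #{stats[:hits]} hits"
  end
  lines << "#{results[:total]} available of #{results[:total_found]} found in #{results[:time]}ms"
  lines
end

puts summarise(results)
```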

Contributing

Please note that this project has a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.

Riddle uses the git-flow process for development. The main branch is the latest released code (in a gem). The develop branch is what's coming in the next release. (There may be occasional feature and hotfix branches, although these are generally not pushed to GitHub.)

When submitting a patch to Riddle, please submit your pull request against the develop branch.

Contributors

Thanks to the following people who have contributed to Riddle in some shape or form:

  • Andrew Aksyonoff
  • Brad Greenlee
  • Lachie Cox
  • Jeremy Seitz
  • Mark Lane
  • Xavier Noria
  • Henrik Nye
  • Kristopher Chambers
  • Rob Anderton
  • Dylan Egan
  • Jerry Vos
  • Piotr Sarnacki
  • Tim Preston
  • Amir Yalon
  • Sam Goldstein
  • Matt Todd
  • Paco Guzmán
  • Greg Weber
  • Enrico Thierbach
  • Jason Lambert
  • Saberma
  • James Cook
  • Alexey Artamonov
  • Paul Gibler
  • Ngan Pham
  • Aaron Gilbralter
  • Steven Bristol
  • Ilia Lobsanov
  • Aleksey Morozov
  • S. Christoffer Eliesen
  • Rob Golkosky
  • Darcy Brown
