• Stars
    star
    322
  • Rank 130,398 (Top 3 %)
  • Language
    Ruby
  • Created almost 16 years ago
  • Updated about 15 years ago

Reviews

There are no reviews yet. Be the first to send feedback to the community and the maintainers!

Repository Details

Fetch exactly what you need

UPDATED

Please head on over to Stephen Sykes’s more recent implementation slim_scrooge – DB agnostic,
leaner and a native MRI callsite hash to negate Kernel#caller overheads

This is a complete rewrite from the initial coverage at igvita.com – read on below

Many thanks to Stephen Sykes ( pennysmalls.com ) for his time spent on shaping, implementing and troubleshooting this release.

An ActiveRecord optimization layer to ensure production Ruby applications only fetch the database content needed to minimize wire traffic, excessive SQL queries and reduce conversion overheads to native Ruby types.

Why bother ?

  • Object conversion and moving unnecessary data is both expensive and tax existing infrastructure in high load setups
  • Manually extracting and scoping SELECT clauses is not sustainable in a clean and painless manner with iterative development, even less so in large projects.
  • Preloading associations can be painful – delegate to Scrooge instead.

What it does


	Processing HotelsController#show (for 127.0.0.1 at 2009-03-18 19:29:38) [GET]
	  Parameters: {"action"=>"show", "id"=>"8699-radisson-hotel-waterfront-cape-town", "controller"=>"hotels"}
	  Hotel Load Scrooged (0.3ms)   SELECT `hotels`.id FROM `hotels` WHERE (`hotels`.`id` = 8699) 
	Rendering template within layouts/application
	Rendering hotels/show
	  SQL (0.2ms)   SELECT `hotels`.location_id,`hotels`.hotel_name,`hotels`.location,`hotels`.from_price,`hotels`.star_rating,`hotels`.apt,`hotels`.latitude,`hotels`.longitude,`hotels`.distance,`hotels`.narrative,`hotels`.telephone,`hotels`.important_notes,`hotels`.nearest_tube,`hotels`.nearest_rail,`hotels`.created_at,`hotels`.updated_at,`hotels`.id FROM `hotels` WHERE `hotels`.id IN ('8699')
	  Image Load Scrooged (0.2ms)   SELECT `images`.id FROM `images` WHERE (`images`.hotel_id = 8699) LIMIT 1
	  SQL (0.2ms)   SELECT `images`.hotel_id,`images`.title,`images`.url,`images`.width,`images`.height,`images`.thumbnail_url,`images`.thumbnail_width,`images`.thumbnail_height,`images`.has_thumbnail,`images`.created_at,`images`.updated_at,`images`.id FROM `images` WHERE `images`.id IN ('488')
	Rendered shared/_header (0.0ms)
	Rendered shared/_navigation (0.2ms)
	  Image Load Scrooged (0.2ms)   SELECT `images`.id FROM `images` WHERE (`images`.hotel_id = 8699) 
	  SQL (0.2ms)   SELECT `images`.hotel_id,`images`.title,`images`.url,`images`.width,`images`.height,`images`.thumbnail_url,`images`.thumbnail_width,`images`.thumbnail_height,`images`.has_thumbnail,`images`.created_at,`images`.updated_at,`images`.id FROM `images` WHERE `images`.id IN ('488')
	  Address Columns (306.2ms)   SHOW FIELDS FROM `addresses`
	  Address Load Scrooged (3.6ms)   SELECT `addresses`.id FROM `addresses` WHERE (`addresses`.hotel_id = 8699) LIMIT 1
	Rendered hotels/_show_sidebar (313.2ms)
	Rendered shared/_footer (0.1ms)
	Completed in 320ms (View: 8, DB: 311) | 200 OK [http://localhost/hotels/8699-radisson-hotel-waterfront-cape-town]


	Processing HotelsController#show (for 127.0.0.1 at 2009-03-18 19:29:40) [GET]
	  Parameters: {"action"=>"show", "id"=>"8699-radisson-hotel-waterfront-cape-town", "controller"=>"hotels"}
	  Hotel Load Scrooged (0.3ms)   SELECT `hotels`.narrative,`hotels`.from_price,`hotels`.star_rating,`hotels`.hotel_name,`hotels`.id FROM `hotels` WHERE (`hotels`.`id` = 8699) 
	  Address Load Scrooged (0.2ms)   SELECT `addresses`.id FROM `addresses` WHERE (`addresses`.hotel_id = 8699) 
	Rendering template within layouts/application
	Rendering hotels/show
	  Image Load Scrooged (0.3ms)   SELECT `images`.url,`images`.id,`images`.height,`images`.width FROM `images` WHERE (`images`.hotel_id = 8699) LIMIT 1
	Rendered shared/_header (0.1ms)
	Rendered shared/_navigation (0.2ms)
	  Image Load Scrooged (0.3ms)   SELECT `images`.thumbnail_width,`images`.id,`images`.thumbnail_height,`images`.thumbnail_url FROM `images` WHERE (`images`.hotel_id = 8699) 
	Rendered hotels/_show_sidebar (1.0ms)
	Rendered shared/_footer (0.1ms)
	Completed in 8ms (View: 5, DB: 1) | 200 OK [http://localhost/hotels/8699-radisson-hotel-waterfront-cape-town]
	

Suggested Use

Install, and you’re off to the races!

Installation

As a Rails plugin ( Recommended )

./script/plugin install git://github.com/methodmissing/scrooge.git

From Git

git pull git://github.com/methodmissing/scrooge.git

As a Gem

sudo gem install methodmissing-scrooge -s http://gems.github.com

Stability

The whole Rails 2.3.2 ActiveRecord test suite passes with scrooge, except for 13 failures related to callsite augmentation (note the SQL reload snippets below). Thoughts on handling or circumventing this much appreciated.


  2) Failure:
test_finding_with_includes_on_belongs_to_association_with_same_include_includes_only_once(EagerAssociationTest)
    [/opt/local/lib/ruby/gems/1.8/gems/activerecord-2.3.1/test/cases/../../lib/active_record/test_case.rb:31:in `assert_queries'
     /opt/local/lib/ruby/gems/1.8/gems/activerecord-2.3.1/test/cases/associations/eager_test.rb:139:in `test_finding_with_includes_on_belongs_to_association_with_same_include_includes_only_once'
     /opt/local/lib/ruby/gems/1.8/gems/activesupport-2.3.1/lib/active_support/testing/setup_and_teardown.rb:57:in `__send__'
     /opt/local/lib/ruby/gems/1.8/gems/activesupport-2.3.1/lib/active_support/testing/setup_and_teardown.rb:57:in `run']:
5 instead of 3 queries were executed.
Queries:
SELECT `posts`.id,`posts`.type FROM `posts` WHERE (`posts`.`id` = 1) 
SELECT `posts`.author_id,`posts`.title,`posts`.body,`posts`.comments_count,`posts`.taggings_count FROM `posts` WHERE (`posts`.`id` = 1) 
SELECT `authors`.name,`authors`.id FROM `authors` WHERE (`authors`.`id` = 1) 
SELECT `authors`.author_address_id,`authors`.author_address_extra_id FROM `authors` WHERE (`authors`.`id` = 1) 
SELECT `author_addresses`.id FROM `author_addresses` WHERE (`author_addresses`.`id` = 1) .
<3> expected but was
<5>.

To run tests in your environment :

  • Configure to run the ActiveRecord test suite as per the docs
  • ‘rake test’ from within the scrooge root directory
  • It’ll attempt to find the path to the ActiveRecord test cases through rubygems
  • Known to work with both 2.2.2, 2.3.0 and the upcoming 2.3.1

Initial Benchmarks

Passenger, Rails 2.2.2, remote DB :


Without scrooge:

Concurrency Level:      1
Time taken for tests:   68.279156 seconds
Complete requests:      150
Failed requests:        0
Write errors:           0
Total transferred:      13741201 bytes
HTML transferred:       13679100 bytes
Requests per second:    2.20 [#/sec] (mean)
Time per request:       455.194 [ms] (mean)
Time per request:       455.194 [ms] (mean, across all concurrent requests)
Transfer rate:          196.53 [Kbytes/sec] received

With scrooge:

Concurrency Level:      1
Time taken for tests:   58.162039 seconds
Complete requests:      150
Failed requests:        0
Write errors:           0
Total transferred:      13747200 bytes
HTML transferred:       13685100 bytes
Requests per second:    2.58 [#/sec] (mean)
Time per request:       387.747 [ms] (mean)
Time per request:       387.747 [ms] (mean, across all concurrent requests)
Transfer rate:          230.82 [Kbytes/sec] received

How it works

Callsites

Ruby allows introspection of the call tree through


  Kernel#caller
	

Scrooge analyzes the last 10 calltree elements that triggered


  ActiveRecord::Base.find_by_sql
	

Lets refer to that as a callsite, or signature.

Thus given SQL such as


"SELECT * FROM `images` WHERE (`images`.hotel_id = 11697)  LIMIT 1"
	

Called from our application helper


["/Users/lourens/projects/superbreak_app/vendor/plugins/scrooge/rails/../lib/scrooge.rb:27:in `find_by_sql'", "/Users/lourens/projects/superbreak_app/vendor/rails/activerecord/lib/active_record/base.rb:1557:in `find_every'", "/Users/lourens/projects/superbreak_app/vendor/rails/activerecord/lib/active_record/base.rb:1514:in `find_initial'", "/Users/lourens/projects/superbreak_app/vendor/rails/activerecord/lib/active_record/base.rb:613:in `find'", "/Users/lourens/projects/superbreak_app/vendor/rails/activerecord/lib/active_record/associations/association_collection.rb:60:in `find'", "/Users/lourens/projects/superbreak_app/vendor/rails/activerecord/lib/active_record/associations/association_collection.rb:67:in `first'", "/Users/lourens/projects/superbreak_app/app/helpers/application_helper.rb:60:in `hotel_image'", "/Users/lourens/projects/superbreak_app/app/views/hotels/_hotel.html.erb:4:in `_run_erb_app47views47hotels47_hotel46html46erb_locals_hotel_hotel_counter_object'", "/Users/lourens/projects/superbreak_app/vendor/rails/actionpack/lib/action_view/renderable.rb:36:in `send'", "/Users/lourens/projects/superbreak_app/vendor/rails/actionpack/lib/action_view/renderable.rb:36:in `render'", "/Users/lourens/projects/superbreak_app/vendor/rails/actionpack/lib/action_view/renderable_partial.rb:20:in `render'"]
	

We can generate a unique callsite identifier with the following calculation :


(The above calltree << "SELECT * FROM `images` ).hash " # cut off conditions etc.

Callsites are tracked on a per model ( table name ) basis.

Scope

Only SQL statements that meet the following criteria is considered for column optimizations :

  • A SELECT statement
  • Not a JOIN
  • The Model has a primary key defined

Only associations that meet the following criteria is associated with a callsite and preloaded
on subsequent requests :

  • Not a polymorphic association
  • Not a collection ( has_many etc. )

How it tracks

The ActiveRecord attributes Hash is replaced with a proxy that automatically augments the callsite with any attributes referenced through the Hash lookup keys. We’re also able to learn which associations is invoked from a given callsite, for preloading on subsequent requests.

Storage

There’s a slight memory hit for each model as the callsites is stored as a class level Hash, which is relatively lightweight and looks like this :


#<Scrooge::Callsite:0x3969968 @primary_key="id", @columns=#<Set: {"narrative", "from_price", "star_rating", "hotel_name", "id"}>, @associations=#<Set: {:address}>, @signature=-736202783, @klass=Hotel(id: integer, location_id: integer, hotel_name: string, location: string, from_price: float, star_rating: integer, apt: boolean, latitude: float, longitude: float, distance: float, narrative: text, telephone: string, important_notes: text, nearest_tube: string, nearest_rail: string, created_at: datetime, updated_at: datetime), @inheritance_column="type">

Tracking and scoping ?

The tracking and scoping phases is superseded by this implementation – none of those hindrances anymore.

Todo

  • Deeper coverage for Scrooge::AttributesProxy, possible handling of replace
  • More test cases for Scrooge::Callsite
  • Have invoking Model#attributes not associate all columns with the callsite
  • Avoid possible missing attribute exceptions for destroyed objects
  • Track rows of result set to allow more targeted loading of associations for a callsite

© 2009 Lourens Naudé (methodmissing) and Stephen Sykes (sdsykes)

More Repositories

1

rbczmq

Ruby extension that wraps the official high level ZeroMQ C API ( http://czmq.zeromq.org/ )
C
125
star
2

mysqlplus_adapter

Ruby
59
star
3

mri_instrumentation

dtrace pid provider hooks for mri
Ruby
43
star
4

hwia

Native MRI HashWithIndifferentAccess implementation
Ruby
37
star
5

eio

A libeio (http://software.schmorp.de/pkg/libeio.html) wrapper for Ruby
Shell
29
star
6

aftermath

An example CQRS implementation in the Ruby language
Ruby
27
star
7

aio

POSIX realtime bindings for Ruby
C
25
star
8

cb

Native MRI callback
Ruby
19
star
9

jio

jio - transactional, journaled file I/O for Ruby
C
18
star
10

io-trace

IO tracing framework for Ruby MRI
C
15
star
11

type_array

A Ruby TypeArray implementation based off the strawman ECMAScript spec ( http://wiki.ecmascript.org/doku.php?id=strawman:typed_arrays )
C
14
star
12

promise

Lightweight Ruby MRI promise extension
Ruby
14
star
13

vio

Ruby Vectored I/O
C
11
star
14

routing_with_optional_formats

Selectively enable or disable formatted named routes being generated
Ruby
11
star
15

memcached_store

ActiveSupport Cache Store for http://github.com/fauna/memcached/tree/master
Ruby
10
star
16

callsite

Native MRI Kernel#callsite implementation ( JRuby pending )
C
10
star
17

channel

Native MRI channel like object to complement github.com/methodmissing/callback/tree
C
9
star
18

rehash

Simple by key / value rehashing for Ruby MRI
C
9
star
19

erubis_template_handler

Rails 2.3 / Edge template handler for erubis ( http://www.kuwata-lab.com/erubis/ )
Ruby
7
star
20

railswaycon_imr_shorttalk

Inside Matz Ruby lightning type talk
Ruby
6
star
21

rb_io_perf

Posix AIO, mmap &amp;&amp; BSD scatter && gather experiments
C
6
star
22

relp

Ruby wrapper for the RELP (Reliable Event Logging Protocol) protocol
C
6
star
23

railswaycon_events

Embracing Events
Ruby
4
star
24

rb_darkstar_server

JRuby wrapper for the Project Darkstar Server component
Ruby
4
star
25

euruko2011

Euruko 2011 - In the Loop
D
4
star
26

stable_session_id

Stable Session Identifier support for Cookie Session Store
Ruby
3
star
27

cord

WIP implementation of a Concat Tree / Rope / Cord for Ruby MRI.
C
3
star
28

model_struct

Dump an ActiveRecord instance and related associations to Struct instances that can easily be pushed down the wire, sans dependencies, in conjunction with an EventMachine Marshal protocol.
3
star
29

cast

Fork of cast.rubyforge.org - current pending tracker items
Ruby
2
star
30

gap_buffer

WIP implementation of a Gap Buffer for Ruby MRI.
C++
1
star
31

methodmissing.github.com

1
star
32

blog

Code snippets referenced from my blog @ http://www.methodmissing.com
1
star
33

string_internals

A look into MRI Ruby's COW and buffer semantics for Strings
Ruby
1
star
34

librelp

Fork of librelp ( http://www.librelp.com/ ) with some cross platform fixes
Shell
1
star