  • Stars: 1,518
  • Rank: 29,730 (Top 0.7 %)
  • Language: Ruby
  • License: MIT License
  • Created almost 10 years ago
  • Updated 4 months ago

Repository Details

Just the right amount of Rails eager loading

Goldiloader

Wouldn't it be awesome if ActiveRecord didn't make you think about eager loading and it just did the "right" thing by default? With Goldiloader it can!

This branch only supports Rails 6.1+ with Ruby 3.0+. For older versions of Rails/Ruby use 4-x-stable, 3-x-stable, 2-x-stable or 1-x-stable.

Consider the following models:

class Blog < ActiveRecord::Base
  has_many :posts
end

class Post < ActiveRecord::Base
  belongs_to :blog
end

Here are some sample queries without the Goldiloader:

> blogs = Blog.limit(5).to_a
# SELECT * FROM blogs LIMIT 5

> blogs.each { |blog| blog.posts.to_a }
# SELECT * FROM posts WHERE blog_id = 1
# SELECT * FROM posts WHERE blog_id = 2
# SELECT * FROM posts WHERE blog_id = 3
# SELECT * FROM posts WHERE blog_id = 4
# SELECT * FROM posts WHERE blog_id = 5

Here are the same queries with the Goldiloader:

> blogs = Blog.limit(5).to_a
# SELECT * FROM blogs LIMIT 5

> blogs.each { |blog| blog.posts.to_a }
# SELECT * FROM posts WHERE blog_id IN (1,2,3,4,5)

Whoa! It automatically loaded all of the posts for our five blogs in a single database query without specifying any eager loads! Goldiloader assumes that you'll access all models loaded from a query in a uniform way. The first time you traverse an association on any of the models it will eager load the association for all the models. It even works with arbitrary nesting of associations.
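The "load for every sibling on first access" behaviour described above can be pictured with a small plain-Ruby toy. This is only a sketch of the idea, not Goldiloader's actual implementation: `LoadContext`, `ToyBlog`, `assign_posts`, and the in-memory `POSTS` table are hypothetical names invented for illustration, and the "query" is just a logged string.

```ruby
# Toy model of auto eager loading. NOT Goldiloader's real code;
# every name below is a hypothetical stand-in.

# Fake posts table: post titles keyed by blog_id.
POSTS = {
  1 => ['Hello', 'World'],
  2 => ['Ruby'],
  3 => []
}.freeze

# Shared by every record that came back from the same query.
class LoadContext
  attr_reader :records, :queries

  def initialize
    @records = []
    @queries = [] # log of simulated SQL statements
  end
end

class ToyBlog
  attr_reader :id

  def initialize(id, context)
    @id = id
    @context = context
    @posts = nil
    context.records << self
  end

  # The first access on any blog loads posts for all blogs in the context.
  def posts
    load_posts_for_context if @posts.nil?
    @posts
  end

  def assign_posts(posts)
    @posts = posts
  end

  private

  def load_posts_for_context
    ids = @context.records.map(&:id)
    @context.queries << "SELECT * FROM posts WHERE blog_id IN (#{ids.join(',')})"
    @context.records.each { |blog| blog.assign_posts(POSTS.fetch(blog.id, [])) }
  end
end

context = LoadContext.new
blogs = [1, 2, 3].map { |id| ToyBlog.new(id, context) }
blogs.each { |blog| blog.posts }
puts context.queries
# prints: SELECT * FROM posts WHERE blog_id IN (1,2,3)
```

Only the first call to `posts` logs a query; the other two blogs find their posts already assigned.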

Read more about the motivation for the Goldiloader in this blog post.

Installation

Add this line to your application's Gemfile:

gem 'goldiloader'

And then execute:

$ bundle

Or install it yourself as:

$ gem install goldiloader

Usage

By default all associations will be automatically eager loaded when they are first accessed so hopefully most use cases should require no additional configuration. Note you're still free to explicitly eager load associations via eager_load, includes, or preload.

Disabling Automatic Eager Loading

You can disable automatic eager loading with the auto_include query scope method:

Blog.order(:name).auto_include(false)

Note this will not disable automatic eager loading for nested associations.

Automatic eager loading can be disabled for specific associations by customizing the association's scope:

class Blog < ActiveRecord::Base
  has_many :posts, -> { auto_include(false) }
end

Automatic eager loading can be disabled globally for all threads:

# config/initializers/goldiloader.rb
Goldiloader.globally_enabled = false

Automatic eager loading can then be selectively enabled for particular sections of code:

# Using a block form
Goldiloader.enabled do
  # Automatic eager loading is enabled for the current thread
  # ...
end

# Using a non-block form
Goldiloader.enabled = true
# Automatic eager loading is enabled for the current thread
# ...
Goldiloader.enabled = false

Similarly, you can selectively disable automatic eager loading for particular sections of code in a thread local manner:

# Using a block form
Goldiloader.disabled do
  # Automatic eager loading is disabled for the current thread
  # ...
end

# Using a non-block form
Goldiloader.enabled = false
# Automatic eager loading is disabled for the current thread
# ...
Goldiloader.enabled = true

Note Goldiloader.enabled=, Goldiloader.enabled, and Goldiloader.disabled are thread local to ensure proper thread isolation in multi-threaded servers like Puma.
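To make the interplay between the global default and the thread-local override concrete, here is a minimal plain-Ruby sketch of such a switch. It is an assumption about how a flag like this could be built, not Goldiloader's source: `ToyLoader`, `enabled?`, and `with_value` are hypothetical names, and the fallback rule (thread-local value wins, otherwise the global default applies) is this sketch's own choice.

```ruby
# Toy enable/disable switch. NOT Goldiloader's source; ToyLoader and
# its methods are hypothetical and exist only for illustration.
module ToyLoader
  KEY = :toy_loader_enabled

  class << self
    attr_accessor :globally_enabled

    # A thread-local value wins; fall back to the global default when unset.
    # thread_variable_get/set are truly thread-local (Thread.current[] is
    # fiber-local in Ruby).
    def enabled?
      value = Thread.current.thread_variable_get(KEY)
      value.nil? ? globally_enabled : value
    end

    def enabled=(value)
      Thread.current.thread_variable_set(KEY, value)
    end

    def enabled(&block)
      with_value(true, &block)
    end

    def disabled(&block)
      with_value(false, &block)
    end

    private

    # Sets the flag for the block and restores the previous value afterwards,
    # even if the block raises.
    def with_value(value)
      previous = Thread.current.thread_variable_get(KEY)
      Thread.current.thread_variable_set(KEY, value)
      yield
    ensure
      Thread.current.thread_variable_set(KEY, previous)
    end
  end

  self.globally_enabled = true
end

inside = nil
other_thread = nil
ToyLoader.disabled do
  inside = ToyLoader.enabled?                            # false in this thread
  other_thread = Thread.new { ToyLoader.enabled? }.value # true: no override there
end
after = ToyLoader.enabled?                               # true again
```

The block form is usually the safer choice because the previous value is restored in an `ensure` clause, whereas the non-block form relies on the caller to reset the flag.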

Association Options

Goldiloader supports a few options on ActiveRecord associations to customize its behavior.

fully_load

There are several association methods that ActiveRecord can either execute on in-memory models or push down into SQL, depending on whether or not the association is loaded. This includes the following methods:

  • first
  • second
  • third
  • fourth
  • fifth
  • forty_two (one of the hidden gems in Rails 4.1)
  • last
  • size
  • ids_reader
  • empty?
  • exists?

This can cause problems for certain usage patterns if we're no longer specifying eager loads:

> blogs = Blog.limit(5).to_a
# SELECT * FROM blogs LIMIT 5

> blogs.each do |blog|
    if blog.posts.exists?
      puts blog.posts
    else
      puts 'No posts'
    end
  end
# SELECT 1 AS one FROM posts WHERE blog_id = 1 LIMIT 1
# SELECT * FROM posts WHERE blog_id IN (1,2,3,4,5)

Notice the first call to blog.posts.exists? was executed via SQL because the posts association wasn't yet loaded. The fully_load option can be used to force ActiveRecord to fully load the association (and do any necessary automatic eager loading) when evaluating methods like exists?:

class Blog < ActiveRecord::Base
  has_many :posts, fully_load: true
end

Limitations

Goldiloader leverages the ActiveRecord eager loader, so it shares some of the same limitations. See the Eager Loading Limitation Workarounds section for some potential workarounds.

has_one associations that rely on a SQL limit

You should not try to auto eager load (or regular eager load) has_one associations that actually correspond to multiple records and rely on a SQL limit to only return one record. Consider the following example:

class Blog < ActiveRecord::Base
  has_many :posts
  has_one :most_recent_post, -> { order(published_at: :desc) }, class_name: 'Post'
end

With standard Rails lazy loading the most_recent_post association is loaded with a query like this:

SELECT * FROM posts WHERE blog_id = 1 ORDER BY published_at DESC LIMIT 1

With auto eager loading (or regular eager loading) the most_recent_post association is loaded with a query like this:

SELECT * FROM posts WHERE blog_id IN (1,2,3,4,5) ORDER BY published_at DESC

Notice the SQL limit can no longer be used which results in fetching all posts for each blog. This can cause severe performance problems if there are a large number of posts.
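The cost difference can be illustrated with a plain-Ruby simulation that counts how many rows each strategy pulls back. This is only a sketch over hypothetical in-memory data (`PostRow`, `POSTS_TABLE`, and both helper methods are invented for illustration); it models the two queries above rather than running ActiveRecord, and it assumes the eager path keeps the first row per blog in memory.

```ruby
# Simulation of the two loading strategies over made-up data.
PostRow = Struct.new(:id, :blog_id, :published_at)

# 3 blogs with 4 posts each (hypothetical rows).
POSTS_TABLE = (1..3).flat_map do |blog_id|
  (1..4).map { |n| PostRow.new(blog_id * 10 + n, blog_id, n) }
end

# Lazy loading: ORDER BY published_at DESC LIMIT 1, run once per blog,
# so exactly one row comes back per blog.
def lazy_most_recent(blog_id)
  POSTS_TABLE.select { |post| post.blog_id == blog_id }.max_by(&:published_at)
end

# Eager loading: one query without LIMIT. Every matching row comes back,
# and only afterwards is one row per blog kept in memory.
def eager_most_recent(blog_ids)
  rows = POSTS_TABLE.select { |post| blog_ids.include?(post.blog_id) }
                    .sort_by { |post| -post.published_at }
  [rows.size, rows.group_by(&:blog_id).transform_values(&:first)]
end

lazy_rows = [1, 2, 3].map { |blog_id| lazy_most_recent(blog_id) }
rows_fetched, most_recent_by_blog = eager_most_recent([1, 2, 3])

puts "lazy rows fetched:  #{lazy_rows.size}"  # 3 (one per blog)
puts "eager rows fetched: #{rows_fetched}"    # 12 (every post of every blog)
```

Both strategies end up with the same most recent post per blog, but the eager version materializes every post first, which is where the cost grows with the number of posts.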

Other Limitations

Associations with any of the following options cannot be eager loaded:

  • limit
  • offset
  • finder_sql

Goldiloader detects associations with any of these options and disables automatic eager loading on them.

It might still be possible to eager load these with Goldiloader by using custom preloads.

Eager Loading Limitation Workarounds

Most of the Rails limitations with eager loading can be worked around by pushing the problematic SQL into the database via lateral joins (or database views if your database doesn't support lateral joins). Consider the following example with associations that can't be eager loaded due to SQL limits:

class Blog < ActiveRecord::Base
  has_many :posts
  has_one :most_recent_post, -> { order(published_at: :desc) }, class_name: 'Post'
  has_many :recent_posts, -> { order(published_at: :desc).limit(5) }, class_name: 'Post'
end

This can be reworked to push the order/limit into lateral joins like this:

class Blog < ActiveRecord::Base
  has_many :posts
  has_one :most_recent_post, -> {
    joins(Arel.sql(<<-SQL.squish))
      INNER JOIN LATERAL (
        SELECT id
        FROM posts p1
        WHERE blog_id = posts.blog_id
        ORDER BY published_at DESC
        LIMIT 1
      ) p2 on (p2.id = posts.id)
    SQL
  }, class_name: 'Post'
  has_many :recent_posts, -> {
    joins(Arel.sql(<<-SQL.squish))
      INNER JOIN LATERAL (
        SELECT id
        FROM posts p1
        WHERE blog_id = posts.blog_id
        ORDER BY published_at DESC
        LIMIT 5
      ) p2 on (p2.id = posts.id)
    SQL
  }, class_name: 'Post'
end

Custom Preloads

In addition to preloading relations, you can also define your own custom preloads in your models. The only requirement is that you can perform a lookup for multiple records/ids and return a single Hash with the ids as keys. If that's the case, these preloads can be nearly anything. Some examples could be:

  • simple aggregations (count, sum, maximum, etc.)
  • more complex custom SQL queries
  • external API requests (ElasticSearch, Redis, etc.)
  • relations with primary keys stored in a jsonb column

Here's how:

class Blog < ActiveRecord::Base
  has_many :posts
  
  def posts_count
    goldiload do |ids|
      # By default, `ids` will be an array of `Blog#id`s
      Post
        .where(blog_id: ids)
        .group(:blog_id)
        .count
    end
  end
end

The first time you call the posts_count method, it will call the block with all model ids from the current context and reuse the result from the block for all other models in the context.
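The "call the block once, reuse the result" behaviour can be sketched in plain Ruby as follows. This is a toy under stated assumptions, not the real `goldiload` implementation: `BatchContext`, `fetch`, `ToyBlog`, and `POST_BLOG_IDS` are hypothetical, and the cache here is keyed by an explicit name instead of the block's source location.

```ruby
# Toy batched lookup. NOT Goldiloader's goldiload; all names are hypothetical.
class BatchContext
  attr_reader :records, :calls

  def initialize
    @records = []
    @cache = {}
    @calls = 0 # how many times a lookup block actually ran
  end

  # Runs the block once per cache name with the ids of every record in the
  # context, then answers later lookups from the cached Hash.
  def fetch(cache_name, key)
    @cache[cache_name] ||= begin
      @calls += 1
      yield(@records.map(&:id))
    end
    @cache[cache_name][key]
  end
end

# blog_id of each row in a made-up posts table.
POST_BLOG_IDS = [1, 1, 2, 1].freeze

class ToyBlog
  attr_reader :id

  def initialize(id, context)
    @id = id
    @context = context
    context.records << self
  end

  def posts_count
    @context.fetch(:posts_count, id) do |ids|
      # Stand-in for Post.where(blog_id: ids).group(:blog_id).count
      POST_BLOG_IDS.select { |blog_id| ids.include?(blog_id) }.tally
    end
  end
end

context = BatchContext.new
blogs = [1, 2, 3].map { |id| ToyBlog.new(id, context) }
counts = blogs.map(&:posts_count)
# counts is [3, 1, nil] and the block ran once (context.calls == 1)
```

In this sketch a blog without posts gets `nil` because its id is missing from the Hash, so callers that need `0` would have to add a default themselves.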

A more complex example might use a custom primary key instead of id, use a non ActiveRecord API and have more complex return values than just scalar values:

class Post < ActiveRecord::Base
  def main_translator_reference
    json_payload[:main_translator_reference]
  end
  
  def main_translator
    goldiload(key: :main_translator_reference) do |references|
      # `references` will be an array of `Post#main_translator_reference`s
      SomeExternalApi.fetch_translators(
        id: references
      ).index_by(&:id)
    end
  end
end

If you want to preload something that is based on multiple keys, you can also pass an array:

class Meeting < ActiveRecord::Base
  def organizer_notes
    goldiload(key: [:organizer_id, :room_id]) do |id_sets|
      # +id_sets+ will be a two dimensional array with the
      # organizer_id and room_id for each item, e.g.
      # [
      #   [<organizer_id_1>, <room_id_1>],
      #   [<organizer_id_2>, <room_id_2>]
      # ]
      notes = logic_for_fetching_organizer_notes
      notes.group_by do |note|
        [note.organizer_id, note.room_id]
      end
    end
  end
end
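With an array of keys, the Hash returned from the block is keyed by arrays as well. A minimal plain-Ruby illustration of that shape, using a hypothetical `OrganizerNote` struct and made-up data in place of real records:

```ruby
# Hypothetical data; OrganizerNote is not part of Goldiloader.
OrganizerNote = Struct.new(:organizer_id, :room_id, :text)

notes = [
  OrganizerNote.new(1, 10, 'Bring projector'),
  OrganizerNote.new(1, 10, 'Order coffee'),
  OrganizerNote.new(2, 10, 'Book room early')
]

# Ruby arrays hash by value, so [organizer_id, room_id] pairs work as keys.
notes_by_pair = notes.group_by { |note| [note.organizer_id, note.room_id] }

first_pair_texts = notes_by_pair[[1, 10]].map(&:text)
# first_pair_texts is ["Bring projector", "Order coffee"]
missing_pair = notes_by_pair[[3, 10]]
# missing_pair is nil
```

Each record can then be matched to its own entry by looking up its `[organizer_id, room_id]` pair.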

Note: The goldiload method will use the source_location of the given block as a cache name to distinguish between multiple defined preloads. If this causes an issue for you, you can also pass a cache name explicitly as the first argument to the goldiload method.

Gotchas

Even though the methods in the examples above (posts_count, main_translator) are actually instance methods, the block passed to goldiload should not contain any references to these instances, as this could break the internal lookup/caching mechanism. We prevent this for the self keyword, so you'll get a NoMethodError. If you get this, you might want to think about the implementation rather than just trying to work around the exception.

Upgrading

From 0.x, 1.x

The auto_include association option has been removed in favor of the auto_include query scope method. Associations that specify this option must migrate to use the query scope method:

class Blog < ActiveRecord::Base
  # Old syntax
  has_many :posts, auto_include: false

  # New syntax
  has_many :posts, -> { auto_include(false) }
end

Status

This gem is tested with Rails 6.1, 7.0, 7.1, and Edge using MRI 3.0, 3.1, 3.2, and 3.3.

Let us know if you find any issues or have any other feedback.

Change log

See the change log.

Contributing

  1. Fork it
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create new Pull Request
