  • Stars: 116
  • Rank: 303,894 (Top 6 %)
  • Language: Ruby
  • License: MIT License
  • Created over 7 years ago
  • Updated over 1 year ago

Repository Details

Dataloader is a generic utility for batch data loading with caching. It works well with GraphQL.

Dataloader

Dataloader is a generic utility to be used as part of your application's data fetching layer to provide a simplified and consistent API to perform batching and caching within a request. It is heavily inspired by Facebook's dataloader.

Getting started

First, install Dataloader using Bundler by adding it to your Gemfile:

gem "dataloader"

To get started, instantiate Dataloader. Each Dataloader instance represents a unique cache. Typically, instances are created per request when used within a web server. For use with a GraphQL server, see the Using with GraphQL section.

Dataloader depends on promise.rb (the Promise class), which you can use freely in batch-ready code (e.g. a loader can return a Promise that resolves to a Promise that resolves to a Promise). Dataloader will try to batch most of them.

Basic usage

# It will be called only once with ids = [0, 1, 2]
loader = Dataloader.new do |ids|
  User.find(*ids)
end

# Schedule data to load
promise_one = loader.load(0)
promise_two = loader.load_many([1, 2])

# Get promises results
user0 = promise_one.sync
user1, user2 = promise_two.sync

Using with GraphQL

You can pass loaders to your resolvers through the context.

UserType = GraphQL::ObjectType.define do
  field :name, types.String
end

QueryType = GraphQL::ObjectType.define do
  name "Query"
  description "The query root of this schema"

  field :user do
    type UserType
    argument :id, !types.ID
    resolve ->(obj, args, ctx) {
      ctx[:user_loader].load(args["id"])
    }
  end
end

Schema = GraphQL::Schema.define do
  lazy_resolve(Promise, :sync)

  query QueryType
end

context = {
  user_loader: Dataloader.new do |ids|
    User.find(*ids)
  end
}

Schema.execute("{ user(id: 12) { name } }", context: context)

Batching

You can create loaders by providing a batch loading function.

user_loader = Dataloader.new { |ids| User.find(*ids) }

A batch loading block accepts an Array of keys and returns an Array or Hash of values (or a Promise that resolves to one).

Dataloader will coalesce all individual loads that occur until .sync is first called on any promise returned by #load or #load_many, and then call your batch function with all requested keys.

user_loader.load(1)
  .then { |user| user_loader.load(user.invited_by_id) }
  .then { |invited_by| "User 1 was invited by #{invited_by[:name]}" }

# Elsewhere in your backend
user_loader.load(2)
  .then { |user| user_loader.load(user.invited_by_id) }
  .then { |invited_by| "User 2 was invited by #{invited_by[:name]}" }

A naive solution would issue four SQL queries to get the required information, but with Dataloader this application will make at most two queries (one to load the users, and a second to load the users who invited them).

Dataloader allows you to decouple unrelated parts of your application without sacrificing the performance of batch data-loading. While the loader presents an API that loads individual values, all concurrent requests will be coalesced and presented to your batch loading function. This allows your application to safely distribute data fetching requirements throughout your application and maintain minimal outgoing data requests.
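The coalescing idea can be sketched in plain Ruby. This is a toy model for illustration only, not the gem's implementation: ToyLoader and PendingValue are hypothetical names, and the sketch leaves out promises, chaining, and error handling.

```ruby
# A toy batching loader, for illustration only. ToyLoader and
# PendingValue are hypothetical names, not part of the Dataloader gem.
class ToyLoader
  # A placeholder for a value that was requested but is not loaded yet.
  class PendingValue
    def initialize(loader, key)
      @loader = loader
      @key = key
    end

    # Asking for the value makes the loader run any queued batch first.
    def sync
      @loader.dispatch
      @loader.results.fetch(@key)
    end
  end

  attr_reader :results

  def initialize(&batch_load)
    @batch_load = batch_load
    @queue = []    # keys requested since the last dispatch
    @results = {}  # key => loaded value
  end

  # Record the key and return immediately; nothing is loaded yet.
  def load(key)
    @queue << key unless @results.key?(key) || @queue.include?(key)
    PendingValue.new(self, key)
  end

  # Call the batch block once with every queued key.
  def dispatch
    return if @queue.empty?

    keys = @queue
    @queue = []
    values = @batch_load.call(keys)
    keys.zip(values) { |key, value| @results[key] = value }
  end
end

calls = []
loader = ToyLoader.new do |ids|
  calls << ids
  ids.map { |id| "user-#{id}" }
end

first = loader.load(1)
second = loader.load(2)

first.sync   # runs one batch covering both keys
second.sync  # already loaded, so no second batch runs
```

Both loads are requested before anything is fetched, so the batch block runs once with both keys, which is the behavior the text above describes for concurrent requests.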

Batch function

A batch loading function accepts an Array of keys and returns an Array of values or a Hash that maps keys to values (or a Promise that resolves to such an Array or Hash). There are a few constraints that must be upheld:

  • The Array of values must be the same length as the Array of keys.
  • Each index in the Array of values must correspond to the same index in the Array of keys.
  • If a Hash is returned, it must include all keys passed to the batch loading function.

For example, if your batch function was provided the Array of keys [2, 9, 6], you could return either of the following:

[
  { id: 2, name: "foo" },
  { id: 9, name: "bar" },
  { id: 6, name: "baz" }
]

{
  2 => { id: 2, name: "foo" },
  9 => { id: 9, name: "bar" },
  6 => { id: 6, name: "baz" }
}
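The constraints above can be made concrete with a small checker. This is a sketch under assumptions: normalize_batch_result is a hypothetical helper, not part of the gem, and it only shows how an Array result and a Hash result can be treated as equivalent.

```ruby
# Hypothetical helper: turn a batch result (Array or Hash) into a Hash
# of key => value, raising if one of the constraints is violated.
def normalize_batch_result(keys, result)
  case result
  when Array
    # The Array of values must be the same length as the Array of keys.
    unless result.length == keys.length
      raise ArgumentError, "expected #{keys.length} values, got #{result.length}"
    end
    # Values are matched to keys by index.
    keys.zip(result).to_h
  when Hash
    # A Hash must include every requested key.
    missing = keys.reject { |key| result.key?(key) }
    raise ArgumentError, "missing keys: #{missing.inspect}" unless missing.empty?
    keys.map { |key| [key, result[key]] }.to_h
  else
    raise TypeError, "batch result must be an Array or a Hash"
  end
end

keys = [2, 9, 6]

from_array = normalize_batch_result(keys, [
  { id: 2, name: "foo" },
  { id: 9, name: "bar" },
  { id: 6, name: "baz" }
])

from_hash = normalize_batch_result(keys, {
  2 => { id: 2, name: "foo" },
  9 => { id: 9, name: "bar" },
  6 => { id: 6, name: "baz" }
})
```

Both calls yield the same key-to-value mapping, which is why either return shape is acceptable.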

Caching

Dataloader provides a memoization cache for all loads that occur within a single instance of it. After #load is called once with a given key, the resulting Promise is cached to eliminate redundant loads.

In addition to relieving pressure on your data storage, caching results per-request also creates fewer objects which may relieve memory pressure on your application:

promise1 = user_loader.load(1)
promise2 = user_loader.load(1)
promise1 == promise2 # => true

Caching per-request

Dataloader caching does not replace Redis, Memcache, or any other shared application-level cache. Dataloader is first and foremost a data loading mechanism, and its cache only serves the purpose of not repeatedly loading the same data in the context of a single request to your application. To do this, it maintains a simple in-memory memoization cache (more accurately: #load is a memoized function).

Avoid multiple requests from different users using the same Dataloader instance, which could result in cached data incorrectly appearing in each request. Typically, Dataloader instances are created when a request begins, and are not used once the request ends.

See the Using with GraphQL section for how to pass Dataloader instances using context.

Caching errors

If a batch load fails (that is, a batch function raises or returns a rejected Promise), then the requested values will not be cached. However, if a batch function returns an Error instance for an individual value, that Error will be cached to avoid frequently loading the same Error.

In some circumstances you may wish to clear the cache for these individual Errors:

user_loader.load(1).rescue do |error|
  user_loader.cache.delete(1)
  raise error
end

Disabling cache

In certain uncommon cases, a Dataloader which does not cache may be desirable. Calling Dataloader.new({ cache: nil }) { ... } will ensure that every call to #load will produce a new Promise, and requested keys will not be saved in memory.

However, when the memoization cache is disabled, your batch function will receive an array of keys which may contain duplicates! Each key will be associated with each call to #load. Your batch loader should provide a value for each instance of the requested key.

loader = Dataloader.new({ cache: nil }) do |keys|
  p keys
  some_loading_function(keys)
end

loader.load('A')
loader.load('B')
loader.load('A')

# > ["A", "B", "A"]
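One way to write a batch function that tolerates duplicates is to fetch each distinct key once and then map the results back onto the original key list. The sketch below is plain Ruby; fetch_records and load_with_duplicates are hypothetical names standing in for your own data access code.

```ruby
# Hypothetical stand-in for a data source. It records the keys it is
# asked for so the example can show that duplicates are fetched once.
FETCHED = []

def fetch_records(unique_keys)
  FETCHED << unique_keys
  unique_keys.map { |key| [key, "record-#{key}"] }.to_h
end

# A batch function body that tolerates duplicate keys: fetch each
# distinct key once, then return one value per requested key, in order.
def load_with_duplicates(keys)
  records = fetch_records(keys.uniq)
  keys.map { |key| records.fetch(key) }
end

values = load_with_duplicates(["A", "B", "A"])
```

The returned Array has one entry per requested key, so it still satisfies the length and ordering constraints of a batch function.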

API

Dataloader

Dataloader is a class for fetching data given unique keys such as the id column (or any other key).

Each Dataloader instance contains a unique memoized cache. For this reason, it is recommended to use one Dataloader instance per web request. You can use longer-lived instances, but then you need to take care of cleaning the cache manually.

You shouldn't share the same dataloader instance across different threads. This behavior is currently undefined.

Dataloader.new(options = {}, &batch_load)

Create a new Dataloader given a batch loading function and options.

  • batch_load: A block which accepts an Array of keys and returns an Array of values or a Hash that maps keys to values (or a Promise that resolves to such a value).
  • options: An optional hash of options:
    • :key A function to produce a cache key for a given load key. Defaults to function { |key| key }. Useful to provide when objects are keys and two similarly shaped objects should be considered equivalent.
    • :cache An instance of cache used for caching of promises. Defaults to Concurrent::Map.new.
      • The only required API is #compute_if_absent(key).
      • You can pass nil if you want to disable the cache.
      • You can pass a pre-populated cache as well. The values can be Promises.
    • :max_batch_size Limits the number of items that get passed in to the batch_load block. Defaults to INFINITY. You can pass 1 to disable batching.
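As a rough sketch of what a custom :cache object might look like, the class below implements #compute_if_absent(key) on top of a Hash. SimpleCache is a hypothetical name, the #delete method is an addition so that single entries can be evicted, and unlike Concurrent::Map this sketch is not thread-safe.

```ruby
# Hypothetical cache class: a small object offering the
# #compute_if_absent(key) method described above, plus #delete so
# individual entries can be evicted. Not thread-safe.
class SimpleCache
  def initialize
    @store = {}
  end

  # Return the stored value for the key. If there is none, run the
  # block, store its result, and return that result.
  def compute_if_absent(key)
    return @store[key] if @store.key?(key)

    @store[key] = yield
  end

  # Remove a single entry so the next lookup recomputes it.
  def delete(key)
    @store.delete(key)
  end
end

cache = SimpleCache.new
computed = 0

first = cache.compute_if_absent(1) { computed += 1; "value" }
second = cache.compute_if_absent(1) { computed += 1; "other" }
```

The second call returns the stored value without running its block. Going by the option description above, an instance would presumably be passed as Dataloader.new({ cache: SimpleCache.new }) { ... }, but that call is not exercised here.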

#load(key)

key [Object] a key to load using batch_load

Returns a Promise of computed value.

You can resolve this promise when you actually need the value with promise.sync.

All calls to #load are batched until the first #sync is encountered. Then it starts batching again, and so on.

#load_many(keys)

keys [Array] list of keys to load using batch_load

Returns a Promise of array of computed values.

For example, to load multiple keys:

promise = loader.load_many(['a', 'b'])
object_a, object_b = promise.sync

This is equivalent to the more verbose:

promise = Promise.all([loader.load('a'), loader.load('b')])
object_a, object_b = promise.sync

#cache

Returns the internal cache, which can be overridden with the :cache option (see the constructor).

This field is writable, so you can reset the cache with something like:

loader.cache = Concurrent::Map.new

#wait

Triggers all batched loaders until there are no keys to resolve.

This method is invoked automatically when the value of any promise is requested with #sync.

Here is the implementation that Dataloader sets as a default for Promise:

class Promise
  def wait
    Dataloader.wait
  end
end

License

MIT
