This Background Jobs style guide is a list of best practices for working with Ruby background jobs.

Active Job Style Guide

This style guide is a list of best practices for working with Ruby background jobs using Active Job with a Sidekiq backend.

Despite the common belief, they work quite well together if you follow the guidelines.

Sidekiq may be used without Active Job, but the latter adds transparency and a useful serialization layer.

This style guide didn’t appear out of thin air - it is based on the professional experience of the editors, official documentation, and suggestions from members of the Ruby community.

These guidelines help to avoid numerous pitfalls. Depending on how you use background jobs, some guidelines may apply and others may not.

You can generate a PDF copy of this guide using AsciiDoctor PDF, and an HTML copy with AsciiDoctor using the following commands:

# Generates README.pdf
asciidoctor-pdf -a allow-uri-read README.adoc

# Generates README.html
asciidoctor README.adoc
Tip

Install the rouge gem to get nice syntax highlighting in the generated document.

gem install rouge

General Recommendations

Active Record Models as Arguments

Pass Active Record models as arguments; do not pass by id. Active Job automatically serializes and deserializes Active Record models using GlobalID, and manual deserialization of the models is not necessary.

GlobalID handles model class mismatches properly.

Deserialization errors are reported to error tracking.

# bad - passing by id
# Deserialization error is reported, the job *is* scheduled for retry.
class SomeJob < ApplicationJob
  def perform(model_id)
    model = Model.find(model_id)
    do_something_with(model)
  end
end

# bad - model mismatch
class SomeJob < ApplicationJob
  def perform(model_id)
    Model.find(model_id)
    # ...
  end
end

# Will try to fetch a Model using another model class, e.g. User's id.
SomeJob.perform_later(user.id)

# acceptable - passing by id
# Deserialization error is reported, the job is *not* scheduled for retry.
class SomeJob < ApplicationJob
  def perform(model_id)
    model = Model.find(model_id)
    do_something_with(model)
  rescue ActiveRecord::RecordNotFound
    Rollbar.warning('Not found')
  end
end

# good - passing with GlobalID
# Deserialization error is reported, the job is *not* scheduled for retry.
class SomeJob < ApplicationJob
  def perform(model)
    do_something_with(model)
  end
end
Warning
Do not replace one style with another in a single step. Use a transitional period to let all jobs scheduled with ids be processed, and use a helper to temporarily support both numeric and GlobalID arguments.
class SomeJob < ApplicationJob
  include TransitionHelper

  def perform(model)
    # TODO: remove this when all jobs with numeric id arguments are processed
    model = fetch(model, Model)
    do_something_with(model)
  end
end

module TransitionHelper
  def fetch(id_or_object, model_class)
    case id_or_object
    when Numeric
      model_class.find(id_or_object)
    when model_class
      id_or_object
    else
      fail "Object type mismatch #{model_class}, #{id_or_object}"
    end
  end
end

Queue Assignments

Explicitly specify a queue to be used in job classes. Make sure the queue is on the list of processed queues.

Putting all jobs into one basket comes with a risk of more urgent jobs being executed with a significant delay. Do not put slow and fast jobs together in one queue. Do not put urgent and non-urgent jobs together in one queue.

# bad - no queue specified
class SomeJob < ApplicationJob
  def perform
    # ...
  end
end

# bad - the wrong queue specified
class SomeJob < ApplicationJob
  queue_as :hgh_prioriti # nonexistent queue specified

  def perform
    # ...
  end
end

# good
class SomeJob < ApplicationJob
  queue_as :high_priority

  def perform
    # ...
  end
end
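
A typo like the one above is easy to miss, because the job is accepted and then simply never picked up. The following is a minimal sketch of a sanity check in plain Ruby. Both `PROCESSED_QUEUES` and `DECLARED_QUEUES` are hypothetical stand-ins; in a real application the first would come from the Sidekiq configuration and the second from the job classes themselves.

```ruby
require 'set'

# Hypothetical list of queues the Sidekiq processes are configured to consume.
PROCESSED_QUEUES = Set[:high_priority, :default, :low_priority].freeze

# Hypothetical mapping of job class names to the queues they declare.
DECLARED_QUEUES = {
  'NotifyUserJob' => :high_priority,
  'BackfillMissingDataJob' => :low_priority,
  'SomeJob' => :hgh_prioriti # typo, no process consumes this queue
}.freeze

# Returns the jobs whose queue is not on the list of processed queues.
def jobs_with_unprocessed_queues(declared, processed)
  declared.reject { |_job, queue| processed.include?(queue) }
end

orphans = jobs_with_unprocessed_queues(DECLARED_QUEUES, PROCESSED_QUEUES)
orphans.each { |job, queue| puts "#{job} uses unprocessed queue #{queue}" }
```

A check of this kind could run in a test suite, so that a misspelled queue name fails the build rather than silently stranding jobs.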

Idempotency

Ideally, jobs should be idempotent, meaning there should be no bad side effects of them running more than once. Sidekiq only guarantees that the jobs will run at least once, but not necessarily exactly once.

Even jobs that do not fail due to errors might be interrupted during non-rolling-release deployments.

class UserNotificationJob < ApplicationJob
  def perform(user)
    send_email_to(user) unless already_notified?(user)
  end
end

Atomicity

During deployment, a job is given 25 seconds to complete by default. After that, the worker is terminated and the job is sent back to the queue. This might result in part of the work being executed twice.

Make the jobs atomic, i.e., all or nothing.
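
The all-or-nothing idea can be sketched in plain Ruby. `InMemoryStore` and `transfer_credits` below are hypothetical and exist only for illustration; in a Rails application the same role is typically played by a database transaction such as `User.transaction`.

```ruby
# A plain-Ruby stand-in for a database, used only to illustrate the idea.
class InMemoryStore
  attr_reader :rows

  def initialize
    @rows = {}
  end

  # Applies all writes made inside the block, or none of them if it raises.
  def transaction
    staged = @rows.dup
    yield staged
    @rows = staged
  end
end

# Hypothetical job body: both writes must land together.
def transfer_credits(store, from:, to:, amount:)
  store.transaction do |rows|
    rows[from] -= amount
    raise ArgumentError, 'insufficient credits' if rows[from].negative?

    rows[to] += amount
  end
end

store = InMemoryStore.new
store.transaction { |rows| rows.merge!('alice' => 10, 'bob' => 0) }

transfer_credits(store, from: 'alice', to: 'bob', amount: 4)

begin
  transfer_credits(store, from: 'alice', to: 'bob', amount: 100)
rescue ArgumentError
  # The failed run left no partial changes behind, so a retry starts clean.
end

p store.rows
```

Because the interrupted run changes nothing, re-executing the job after it is sent back to the queue does not apply half of the work twice.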

Threads

Do not use threads in your jobs. Spawn jobs instead. Spinning up a thread in a job leads to opening a new database connection, and the connections are easily exhausted, up to the point when the webserver is down.

# bad - consumes all available connections
class SomeJob < ApplicationJob
  def perform
    User.find_each do |user|
      Thread.new do
        ExternalService.update(user)
      end
    end
  end
end

# good
class SomeJob < ApplicationJob
  def perform(user)
    ExternalService.update(user)
  end
end

User.find_each do |user|
  SomeJob.perform_later(user)
end

Retries

Avoid using ActiveJob’s built-in retry_on or ActiveJob::Retry (activejob-retry gem). Use Sidekiq retries, which are also available from within Active Job with Sidekiq 6+.

Do not hide or extract job retry mechanisms. Keep retry directives visible in the jobs.

# bad - makes three attempts without submitting to Rollbar,
# fails and relies on Sidekiq's retry that would also make several
# retry attempts, submitting each of the failures to Rollbar.
class SomeJob < ApplicationJob
  retry_on ThirdParty::Api::Errors::SomeError, wait: 1.minute, attempts: 3

  def perform(user)
    # ...
  end
end

# bad - it's not clear upfront if the job will be retried or not
class SomeJob < ApplicationJob
  include ReliableJob

  def perform(user)
    # ...
  end
end

# good - Sidekiq deals with retries
class SomeJob < ApplicationJob
  sidekiq_options retry: 3

  def perform(user)
    # ...
  end
end

Batches

Always use retries for jobs that are executed in batches; otherwise, the batch will never succeed.

Use Retries

Use the retry mechanism. Do not let jobs end up in Dead Jobs. Let Sidekiq retry the jobs, and don’t spend time re-running the jobs manually.

Mind Transactions

Background processing of a scheduled job may happen sooner than you expect. Make sure to only schedule jobs when the transaction has been committed.

# bad - job may perform earlier than the transaction is committed
User.transaction do
  users_params.each do |user_params|
    user = User.create!(user_params)
    NotifyUserJob.perform_later(user)
  end
end

# good
users = User.transaction do
          users_params.map do |user_params|
            User.create!(user_params)
          end
        end
users.each { |user| NotifyUserJob.perform_later(user) }

Local Performance Testing

Due to Rails auto-reloading, Sidekiq jobs are executed one-by-one, with no parallelism. That may be confusing.

Run Sidekiq in an environment that has eager_load set to true, or with the following flags to circumvent this behavior:

EAGER_LOAD=true ALLOW_CONCURRENCY=true bundle exec sidekiq

Critical Jobs

Background job processing may be down for a prolonged period (minutes), e.g. during a failed deployment or a burst of other jobs.

Consider running time-critical and mission-critical jobs in-process.

Business Logic in Jobs

Do not put business logic in jobs; extract it.

# bad
class SendUserAgreementJob < ApplicationJob
  # Convenient method to check if preconditions are satisfied to avoid
  # scheduling unnecessary jobs.
  def self.perform_later_if_applies(user)
    job = new(user)
    return unless job.satisfy_preconditions?

    job.enqueue
  end

  def perform(user)
    @user = user
    return unless satisfy_preconditions?

    agreement = agreement_for(user: user)
    AgreementMailer.deliver_now(agreement)
  end

  def satisfy_preconditions?
    legal_agreement_signed? &&
      !user.removed? &&
      !user.referral? &&
      !(user.active? || user.pending?) &&
      !user.has_flag?(:on_hold)
  end

  private

  attr_reader :user

  # business logic
end

# good - business logic is not coupled to the job
class SendUserAgreementJob < ApplicationJob
  def perform(user)
    agreement = agreement_for(user: user)
    AgreementMailer.deliver_now(agreement)
  end
end

SendUserAgreementJob.perform_later(user) if satisfy_preconditions?

Scheduling a Job from a Job

Weigh the pros and cons in each case, whether to schedule jobs from jobs or to execute them in-process. Factors to consider: Is it a retriable job? Can inner jobs fail? Are they idempotent? Is there anything in the host job that may fail?

# good - error kernel pattern
# bad - additional jobs are spawned
class SomeJob < ApplicationJob
  def perform
    SomeMailer.some_notification.deliver_later
    OtherJob.perform_later
  end
end

# good - no additional jobs
# bad - if `OtherJob` fails, `SomeMailer` will be re-executed on retry as well
class SomeJob < ApplicationJob
  def perform
    SomeMailer.some_notification.deliver_now
    OtherJob.perform_now
  end
end

Numerous Jobs

When a lot of jobs should be performed, it’s acceptable to schedule them.

Consider using batches for improved traceability.

Also, specify the same queue for the host job and sub-jobs.

# acceptable
def perform
  batch = Sidekiq::Batch.new
  batch.description = 'Send weekly reminders'
  batch.jobs do
    User.find_each do |user|
      WeeklyReminderJob.perform_later(user)
    end
  end
end

Job Renaming

Rename job classes carefully to avoid situations where jobs are scheduled but there is no class left to process them.

Note
This also relates to mailers used with deliver_later.
# good - keep the old class
# TODO: Delete this alias in a few weeks when old jobs are safely gone
OldJob = NewJob

sleep

Do not use Kernel.sleep in jobs. sleep blocks the worker thread, which is then unable to process other jobs. Re-schedule the job for a later time, or use limiters with a custom exception.

# bad
class SomeJob < ApplicationJob
  def perform(user)
    attempts_number = 3
    ThirdParty::Api::User.renew(user.external_id)
  rescue ThirdParty::Api::Errors::TooManyRequestsError => error
    sleep(error.retry_after)
    attempts_number -= 1
    retry unless attempts_number.zero?
    raise
  end
end

# good - retry job in a while, a limited number of times
class SomeJob < ApplicationJob
  sidekiq_options retry: 3
  sidekiq_retry_in do |count, exception|
    case exception
    when ThirdParty::Api::Errors::TooManyRequestsError
      count + 1 # i.e. 1s, 2s, 3s
    end
  end

  def perform(user)
    ThirdParty::Api::User.renew(user.external_id)
  end
end

# good - fine-grained control of API usage in jobs
class SomeJob < ApplicationJob
  def perform(user)
    LIMITER.within_limit do
      ThirdParty::Api::User.renew(user.external_id)
    end
  end
end

# config/initializers/sidekiq.rb
Sidekiq::Limiter.configure do |config|
  config.errors << ThirdParty::Api::Errors::TooManyRequestsError
end

Infrastructure

One Process per Core

On multi-core machines, run as many Sidekiq processes as needed to fully utilize the cores. A Sidekiq process uses only one CPU core. A rule of thumb is to run as many processes as there are cores available.
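
As a rough sketch of that rule of thumb, the number of processes to start can be derived from the core count reported by Ruby's standard library. The `WORKER_PROCESS_COUNT` environment variable is a hypothetical override invented for this example, not a Sidekiq setting.

```ruby
require 'etc'

# Number of logical CPU cores visible to this Ruby process.
cores = Etc.nprocessors

# Hypothetical override so an operator can reserve cores for other services.
process_count = Integer(ENV.fetch('WORKER_PROCESS_COUNT', cores))

puts "Start #{process_count} Sidekiq process(es) on this #{cores}-core machine"
```

A deployment script could use the resulting number to decide how many Sidekiq processes to launch on each host.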

Redis Memory Constraints

Redis’s database size is limited by server memory. Some prefer to explicitly set maxmemory, and in combination with a noeviction policy, this may result in errors on job scheduling.

Dead Jobs

Do not keep jobs in Dead Jobs. With extended backtrace enabled for Dead Jobs, a single dead job can occupy as much as 20KB in the database.

Re-run the jobs once the root cause is fixed, or delete them.

Excessive Arguments

Do not pass an excessive number of arguments to a job.

# bad
SomeJob.perform_later(user_name, user_status, user_url, user_info: huge_json)

# good
SomeJob.perform_later(user, user_url)

Hordes

Do not schedule hundreds of thousands of jobs at once. A single job with no parameters takes 0.5KB. Measure the exact footprint for each job with its arguments.
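
One way to get a feel for the footprint is to serialize a payload and count the bytes. The sketch below uses a hypothetical payload that is only loosely shaped like a serialized job; the keys actually stored by Active Job and Sidekiq differ, so treat the numbers as an illustration and measure real payloads in practice.

```ruby
require 'json'
require 'securerandom'
require 'time'

# Hypothetical payload, loosely shaped like a serialized job.
def sample_payload(arguments)
  {
    'class' => 'SomeJob',
    'queue' => 'default',
    'jid' => SecureRandom.hex(12),
    'enqueued_at' => Time.now.utc.iso8601,
    'args' => arguments
  }
end

# Size in bytes of the payload once serialized to JSON.
def footprint(arguments)
  JSON.generate(sample_payload(arguments)).bytesize
end

small = footprint(['gid://app/User/1'])
large = footprint(['gid://app/User/1', { 'user_info' => 'x' * 10_000 }])

jobs = 500_000
puts "small job: #{small} B, #{(small * jobs) / 1024 / 1024} MiB for #{jobs} jobs"
puts "large job: #{large} B, #{(large * jobs) / 1024 / 1024} MiB for #{jobs} jobs"
```

Multiplying the per-job size by the number of jobs scheduled at once shows quickly whether a large argument, such as a JSON blob, is affordable.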

Monitoring

Monitor the server and store historical metrics. Properly configured metrics will provide answers to improve the throughput of job processing.

Commercial Features

Some commercial features are available as third-party add-ons. However, their reliability is in most cases questionable.

Use Batches

Group jobs related to one task using Sidekiq Batches. Batch’s jobs method is atomic, i.e., all the jobs are scheduled together, in an all-or-nothing fashion.

# bad
class BackfillMissingDataJob < ApplicationJob
  def self.run_batch
    Model.where(attribute: nil).find_each do |model|
      perform_later(model)
    end
  end

  def perform(model)
    # do the job
  end
end

# good
class BackfillMissingDataJob < ApplicationJob
  def self.run_batch
    batch = Sidekiq::Batch.new
    batch.description = 'Backfill missing data'
    batch.on(:success, BackfillComplete, to: SysAdmin.email)
    batch.jobs do
      Model.where(attribute: nil).find_each do |model|
        perform_later(model)
      end
    end
  end

  def perform(model)
    # do the job
  end
end

Self-scheduling Jobs

Avoid using self-scheduling jobs for long-running jobs. Prefer using Sidekiq Batches to split the workload.

# bad
class BackfillMissingDataJob < ApplicationJob
  SIZE = 20
  def perform(offset = 0)
    models = Model.where(attribute: nil)
      .order(:id).offset(offset).limit(SIZE)
    return if models.empty?

    models.each do |model|
      model.update!(attribute: value_for(model))
    end
    self.class.perform_later(offset + SIZE)
  end
end

# good
class BackfillMissingDataJob < ApplicationJob
  def self.run_batch
    Sidekiq::Batch.new.jobs do
      Model.where(attribute: nil)
        .find_in_batches(batch_size: 20) do |models|
        BackfillMissingDataJob.perform_later(models)
      end
    end
  end

  def perform(models)
    models.each do |model|
      model.update!(attribute: value_for(model))
    end
  end
end

API Rate-limited Operations

Most third-party APIs have usage limits and will fail if there are too many calls in a period. Use rate limiting in jobs that make such external calls.

Never rely on the number of jobs to be executed. Even if you schedule jobs to be executed at a specific moment, they might be executed all at once, due to, e.g., a traffic jam in job processing. Use Enterprise Rate Limiting. Use the strategy (Concurrent, Bucket, Window) that is most suitable to the specific API rate limiting.

# bad
class UpdateExternalDataJob < ApplicationJob
  def perform(user)
    new_attribute = ThirdParty::Api.get_attribute(user.external_id)
    user.update!(attribute: new_attribute)
  end
end

User.where.not(external_id: nil)
  .find_in_batches.with_index do |users, group_number|
  users.each do |user|
    UpdateExternalDataJob
      .set(wait: group_number.minutes)
      .perform_later(user)
  end
end

# good
class UpdateExternalDataJob < ApplicationJob
  LIMITER = Sidekiq::Limiter.window('third-party-attribute-update', 20, :minute, wait_timeout: 0)

  def perform(user)
    LIMITER.within_limit do
      new_attribute = ThirdParty::Api.get_attribute(user.external_id)
      user.update!(attribute: new_attribute)
    end
  end
end

# Application code
User.where.not(external_id: nil).find_each do |user|
  UpdateExternalDataJob.perform_later(user)
end

# config/initializers/sidekiq.rb
Sidekiq::Limiter.configure do |config|
  config.errors << ThirdParty::Api::Errors::TooManyRequestsError
end

Default Limiter Backoff

Do not rely on Sidekiq’s default limiter backoff. It reschedules the job about five minutes into the future.

DEFAULT_BACKOFF = ->(limiter, job) do
  (300 * job['overrated']) + rand(300) + 1
end

It doesn’t fit the cases when limits are released quickly or are kept for hours. Configure it on a limiter basis.

Sidekiq::Limiter.configure do |config|
  config.backoff = ->(limiter, job) do
    case limiter.name
    when 'daily-third-party-api-limit'
      12.hours
    else
      (300 * job['overrated']) + rand(300) + 1 # fallback to default
    end
  end
end
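
To see which delays the default formula produces, it can be evaluated in plain Ruby without Sidekiq. This sketch copies the formula quoted above and assumes `job['overrated']` counts how many times the job has been over the limit; `backoff_bounds` is a hypothetical helper added for the illustration.

```ruby
# The default backoff formula as quoted above; the limiter is unused here.
DEFAULT_BACKOFF = ->(_limiter, job) do
  (300 * job['overrated']) + rand(300) + 1
end

# Smallest and largest delay (in seconds) the formula can return for a
# given `overrated` count: rand(300) yields 0..299, so the jitter is 1..300.
def backoff_bounds(overrated)
  base = 300 * overrated
  [base + 1, base + 300]
end

(1..3).each do |overrated|
  low, high = backoff_bounds(overrated)
  sample = DEFAULT_BACKOFF.call(nil, { 'overrated' => overrated })
  puts "overrated=#{overrated}: #{low}..#{high}s (sampled #{sample}s)"
end
```

The delay grows by five minutes with each count, which is too slow for a limit that resets every second and too fast for one that resets daily.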

Keep in mind how limiter comparison works. Compare limiters by the name, not by the object.

Sidekiq::Limiter.bucket('custom-limiter', 1, :day) == Sidekiq::Limiter.bucket('custom-limiter', 1, :day) # => false

Reuse Limiters

Create limiters once during startup and reuse them. Limiters are thread-safe and designed to be shared.

Each limiter occupies 114 bytes in Redis, and the default TTL is 3 months. 1 million jobs a month using non-shared limiters will be constantly consuming 300MB in Redis.
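
The estimate can be reproduced with a few lines of arithmetic. The sketch assumes one fresh limiter per job and uses only the figures quoted above.

```ruby
BYTES_PER_LIMITER = 114      # size of one limiter in Redis, as quoted above
TTL_MONTHS = 3               # default TTL, as quoted above
JOBS_PER_MONTH = 1_000_000   # assumption: one fresh limiter per job

# Limiters created during the last TTL_MONTHS are all still alive in Redis.
live_limiters = JOBS_PER_MONTH * TTL_MONTHS
bytes = live_limiters * BYTES_PER_LIMITER

puts format('%.0f MB held by %d limiters', bytes / 1_000_000.0, live_limiters)
```

This works out to 342 MB, which the guide rounds to 300MB.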

# bad - limiter is re-created on each job call
class SomeJob < ApplicationJob
  def perform(...)
    limiter = Sidekiq::Limiter.concurrent('erp', 50, wait_timeout: 0, lock_timeout: 30)
    limiter.within_limit do
      # call ERP
    end
  end
end

# good
class SomeJob < ApplicationJob
  ERP_LIMIT = Sidekiq::Limiter.concurrent('erp', 50, wait_timeout: 0, lock_timeout: 30)

  def perform(...)
    ERP_LIMIT.within_limit do
      # call ERP
    end
  end
end

# acceptable - an exception is when the limiter is specific to something, and that is used as a distinction key in limiter name.
class SomeJob < ApplicationJob
  def perform(user)
    # Rate limiting is per user account
    user_throttle = Sidekiq::Limiter.bucket("stripe-#{user.id}", 30, :second, wait_timeout: 0)
    user_throttle.within_limit do
      # call stripe with user's account creds
    end
  end
end

Limiter Options

The usage of incorrect limiter options may break its behavior.

wait_timeout

Set wait_timeout to zero or some reasonably low value. Doing otherwise will result in idle workers, while there might be jobs waiting in the queue.

Keep in mind the backoff configuration, and carefully pick the timing when the job is retried.

lock_timeout for Concurrent Limiter

Set lock_timeout to a value longer than the job takes to execute. Otherwise, the lock will be released too early and more concurrent jobs will be executed than expected.

Global Limiting Middleware

The Sidekiq::Limiter::OverLimit exception might be rescued by jobs to discard themselves from locally defined limiters. To avoid interference between global throttle limiter middleware and local job limiters, wrap Sidekiq::Limiter::OverLimit exception in middleware.

# Middleware
class SaturationLimiter
  SaturationOverLimit = Class.new(StandardError)

  def self.wrapper(job, block)
    LIMITER.within_limit { block.call }
  rescue Sidekiq::Limiter::OverLimit => e
    limiter_name = e.limiter.name
    # Re-raise if an over the limit exception is coming from a limiter
    # defined on the job level.
    raise unless limiter_name == LIMITER.name

    # Use a custom exception that Sidekiq::Limiter is using to re-schedule
    # the job to a later time, but in a way that doesn't overlap with the
    # limiters defined on the job level.
    raise SaturationOverLimit, limiter_name
  end
end

# config/initializers/active_job.rb
ActiveJob::Base.around_perform(&SaturationLimiter.method(:wrapper))

Ignore OverLimit Exceptions on Third-party Services

Sidekiq::Limiter::OverLimit is an internal mechanism, and it doesn’t make sense to report when it triggers.

# config/initializers/rollbar.rb
Rollbar.configure do |config|
  config.exception_level_filters.merge!('Sidekiq::Limiter::OverLimit' => 'ignore')
end
# config/newrelic.yml
production:
  error_collector:
    enabled: true
    ignore_errors: "Sidekiq::Limiter::OverLimit"

Rolling Restarts

Use Enterprise Rolling Restarts. With Rolling Restarts, deployments do not suffer from downtime. Also, it prevents non-atomic and non-idempotent jobs from being interrupted and executed more than once on deployments.

Warning
For Capistrano-style deployments, make sure to use the --reexec-as and --drop-env-var BUNDLE_GEMFILE einhorn options to avoid stale code and dependencies.

Testing

perform

Don’t use job.perform or job_class.new.perform; calling perform directly bypasses the Active Job serialization/deserialization stage. Use job_class.perform_now instead. The same applies to an implicitly defined RSpec subject, which is a job instance:

# bad - `perform` method is called directly on an implicitly defined subject
RSpec.describe SomeJob do
  # implicitly defined `subject` is `SomeJob.new`
  it 'updates user status' do
    expect { subject.perform(user) }.to change { user.status }.to(:updated)
  end
end

# bad - `perform` method is called directly on a job instance
RSpec.describe SomeJob do
  it 'updates user status' do
    expect { SomeJob.new.perform(user) }.to change { user.status }.to(:updated)
  end
end

# good
RSpec.describe SomeJob do
  it 'updates user status' do
    expect { SomeJob.perform_now(user) }.to change { user.status }.to(:updated)
  end
end

perform_later

Prefer perform_now to perform_later when testing jobs. It doesn’t involve Redis.

# bad - unnecessary roundtrip to Redis
RSpec.describe SomeJob do
  it 'updates user status' do
    expect do
      SomeJob.perform_later(user)
      perform_scheduled_jobs
    end.to change { user.status }.to(:updated)
  end
end

# good
RSpec.describe SomeJob do
  it 'updates user status' do
    expect { SomeJob.perform_now(user) }.to change { user.status }.to(:updated)
  end
end

History

This guide came to life as an internal company list of the best practices of working with ActiveJob and Sidekiq. It is compiled from remarks collected from numerous code reviews, and during the migration from another background job processing tool to Sidekiq. Initially created by Phil Pirozhkov with the help of colleagues, and sponsored by Toptal.

Contributing

The guide is a work in progress. Improving such guidelines is a great (and simple) way to help the Ruby community!

Nothing written in this guide is set in stone. We desire to work together with everyone interested in gathering the best practices of working with background jobs. The goal is to create a resource that will be beneficial to the entire Ruby community.

Feel free to open tickets or send pull requests with improvements. Thanks in advance for your help!

How to Contribute

It’s easy, just follow the contribution guidelines below:

  • Fork on GitHub

  • Make your feature addition or bug fix in a feature branch.

  • Include a good description of your changes

  • Push your feature branch to GitHub

  • Send a Pull Request

Spread the Word

A community-driven style guide is of little use to a community that doesn’t know about its existence. Tweet about the guide and share it with your friends and colleagues. Every comment, suggestion, or opinion we get makes the guide just a little bit better. And we want to have the best possible guide, don’t we?
