Sidekiq (Infinitely) Iterable Jobs


Jon Sully

@jon-sully

In our last post we wrote all about Sidekiq 7.3’s new feature, iterable jobs — how to use them, the kinds of systems they may be useful in, and where to avoid using them… but we had another idea along the way.

Earlier this year we wrote a blog post called “How to Run Code (Safely) on Repeat Forever”, all about different patterns for executing repetitive code safely forever. This could be some kind of scanning job that simply needs to run every 10 seconds, or an import processor that runs every 30 seconds. The specifics aren’t as important; the idea is simply that we wanted to illustrate a few strategies for safely repeating code forever, and compare how those strategies stack up.

As a brief recap, we ultimately illustrated three strategies.

The first was the self re-enqueuing background job. The idea is simple: once your work completes in a background job, that very same background job kicks off another copy of itself. This forms the infinite chain which should keep the logic running:

Self re-enqueuing background job

E.g., something like this:

# ~/app/jobs/cycle_job.rb

class CycleJob
  include Sidekiq::Worker

  def perform
    results = SomeDatabaseQuery.run
    aggregated_data = Aggregator.call(results)
    AggregatedStuff.insert(aggregated_data)

    CycleJob.perform_async
  end
end

And the tl;dr for this strategy is that getting the timing just right (if you want runs exactly 10 seconds apart, not 12-13, etc.) can be challenging, random errors breaking the chain can be devastating, and there can even be odd occurrences where the chain forks and you end up with two chains going. We ran this setup for quite a while but ended up leaving it after many of these headaches. Here be dragons!
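As an aside on the timing problem: one mitigation (a sketch, not code we actually ran) is to anchor each re-enqueue to a fixed wall-clock grid instead of re-enqueuing immediately, so the job’s own runtime doesn’t push each cycle later and later. The helper below is hypothetical; it just computes the delay until the next grid tick:

```ruby
# A sketch of anchoring re-enqueues to a fixed 10-second grid so that
# job runtime doesn't accumulate into the interval. Hypothetical helper.
INTERVAL = 10 # seconds

def delay_until_next_tick(now = Time.now.to_f)
  # How long until the next multiple of INTERVAL on the wall clock
  INTERVAL - (now % INTERVAL)
end

# Inside the job, you'd then schedule the next run with something like:
#   CycleJob.perform_in(delay_until_next_tick)
```

This keeps the chain on a steady cadence, but note it does nothing for the broken-chain or forked-chain problems.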

The second strategy was the (lengthily titled) scheduled background job with global lock. The idea with this one is that you run an external (its own process/dyno/container) job scheduling system that’s responsible for kicking off (not executing!) jobs on a cyclical schedule. Then the job logic itself acquires a global lock on some resource so that multiple copies of itself can’t run at the same time. Visually, that looks like this:

global lock

And the code might look something like this:

# ~/clock.rb

class Clock
  include SomeScheduleFramework

  every 2.seconds { CycleJob.perform_async }
  every 5.minutes { SomeOtherJob.perform_async }
end

# ~/app/jobs/cycle_job.rb

class CycleJob
  include Sidekiq::Worker

  def perform
    # Lock against other attempts
    return unless Rails.redis { |r| r.set "cycle-job-lock", "busy", nx: true }

    results = SomeDatabaseQuery.run
    aggregated_data = Aggregator.call(results)
    AggregatedStuff.insert(aggregated_data)

    # Unlock for next pass
    Rails.redis { |r| r.del "cycle-job-lock" }
  end
end

While this approach requires more infrastructure (a dependency on Redis and a dedicated clock process), the tl;dr on this one is that it’s pretty safe and reliable. This is the pattern we moved to after years of the self re-enqueuing job. Since we wrote the first article, we haven’t had any issues with this pattern. Neato! Not quite “throw money at the problem”, but close enough 😁
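One subtlety worth calling out with this pattern: if the job crashes between locking and unlocking, the lock stays held forever and the cycle wedges. Redis’s SET supports an expiry (`ex:`) alongside `nx:`, which caps how long a dead job can block the next one. A tiny in-memory stand-in (purely illustrative — real code would just call `r.set("cycle-job-lock", "busy", nx: true, ex: 60)`) shows the set-if-not-exists-with-expiry semantics:

```ruby
# An in-memory stand-in for Redis's `SET key value NX EX ttl` semantics,
# purely to illustrate why an expiry keeps a crashed job from wedging
# the lock forever. Real code would use Redis itself.
class FakeLock
  def initialize
    @store = {} # key => expiry timestamp
  end

  # Returns true if the lock was acquired (i.e. SET ... NX succeeded)
  def acquire(key, ttl, now)
    expiry = @store[key]
    return false if expiry && expiry > now # still held by someone

    @store[key] = now + ttl
    true
  end
end

lock = FakeLock.new
lock.acquire("cycle-job-lock", 60, 0)  # first acquire wins
lock.acquire("cycle-job-lock", 60, 30) # still held => denied
lock.acquire("cycle-job-lock", 60, 61) # expired => a crash self-heals
```

The tradeoff is picking a TTL comfortably longer than the job’s worst-case runtime, or the lock can expire mid-run.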

The third strategy we discussed in that article is what we dubbed the ‘forever-running rake task’. The idea here is that we skip using a background job system altogether and instead just do the cyclical work in a rake task that loops forever (with a delay of some sort):

forever-running rake task

The code here may be, for example:

# ~/lib/tasks/continuous.rake

namespace :continuous do
  task aggregate: :environment do
    loop do
      results = SomeDatabaseQuery.run
      aggregated_data = Aggregator.call(results)
      AggregatedStuff.insert(aggregated_data)

      sleep 5
    end
  end
end

While this strategy has some interesting tradeoffs in terms of sequential execution guarantees (which are doable with background jobs and global locks, but obviously trickier), the tl;dr is that it’s extremely wasteful, resource-wise, and that it doesn’t scale well. If you’re renting cloud server time to run this process, the fact that it sleeps for the majority of its lifecycle is truly just wasted money 💸. Additionally, if you have multiple different workloads that need to run continuously, you’d need to create multiple of these tasks and multiple new processes in your Procfile, thus spinning up multiple dynos/containers which will burn money over time. Not ideal!
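To make the scaling cost concrete: each forever-looping task needs its own entry — and therefore its own dyno/container — in the Procfile. Something like this (process names hypothetical):

```
# ~/Procfile — one dedicated process per continuous workload
aggregate: bundle exec rake continuous:aggregate
importer:  bundle exec rake continuous:import
```

Two continuous workloads means two always-on processes, each mostly asleep.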

And that was essentially the article. Three ways to run code on repeat forever; option two seemingly being the safest and best. That’s what we’ve been running our continuous code with since — four different jobs that run on ten-second intervals and one that runs on a one-second interval. That’s a lot of continuous code.

So while that’s all been working fine and we don’t have any need to change things, an idea popped into our minds while writing our last article. What if you set up a Sidekiq iterable job where the enumerable is simply infinite? Will that, in essence, run your code safely forever?

A picture is worth a thousand words, so let me illustrate with a code sample that ought to accomplish this hack:

class ProductImageChecker
  include Sidekiq::IterableJob

  # Sidekiq passes its resume cursor as a keyword argument; **kwargs
  # swallows it here, since every iteration is identical anyway
  def build_enumerator(*args, **kwargs)
    Enumerator.new do |yielder|
      loop do
        yielder.yield
      end
    end
  end

  def each_iteration(*args)
    # Same example as above; some repeating work example:

    results = SomeDatabaseQuery.run
    aggregated_data = Aggregator.call(results)
    AggregatedStuff.insert(aggregated_data)

    sleep 5 # slow down the iteration
  end
end

The each_iteration bit is pretty straightforward, but let’s talk about that Enumerator.new block! We know that build_enumerator needs to return some kind of enumerator, but rather than returning a set of records or other array-style object, we create a new Enumerator from scratch. Inside that definition (the do |yielder|) we set up an infinite loop (loop do). This essentially represents the content of the Enumerator we’re creating. Then, inside that loop, we’re simply yielding. In context, that means the Enumerator can be enumerated endlessly, and each time it’s enumerated it simply yields control to the caller. Or, in an even higher context, we get the equivalent of an infinitely long array without any objects inside! That includes the benefit of iterating forever, but also includes the benefit of not having to set up any objects or data for it! Neat.
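You can see this behavior in plain Ruby, entirely outside of Sidekiq — a hand-built Enumerator is lazy, so the infinite loop inside it is perfectly safe until someone actually asks it for values:

```ruby
# A hand-rolled infinite Enumerator, identical in spirit to the one in
# build_enumerator above. The loop body never runs until something
# enumerates, and it only runs as many times as values are requested.
endless = Enumerator.new do |yielder|
  loop { yielder.yield }
end

endless.take(3) # => [nil, nil, nil] — three "empty" iterations, on demand
```

Each yield produces nil (there’s no data to hand over), which is exactly what we want: the enumeration itself is the point, not the elements.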

If we use the same rubric from our last blog post to determine the pros and cons of this new pattern, we need to ask a few questions.

Does it satisfy the “run often” premise? Yes, most definitely. Sidekiq will churn away at the iterable job loop, under its own management, as fast as it can. So much so that we actually need to bake a speed limit into the iteration by adding a sleep call!

Now, you’d be correct if you thought back to the sleep call we added in the forever-running rake task strategy and how that led to a conclusion of “it wastes resources!” On its face, it might feel like the sleep call here will lead to wasted resources too. The one redeeming quality here is that, since this is happening in a Sidekiq job and Sidekiq is a multi-threaded job processing system, other threads will be able to work on other jobs while this job is sleeping! It’s not a total redemption, but it’s pretty close.

Does it satisfy the “don’t run multiple copies at a time” premise? Absolutely, and it’s not even our responsibility anymore! With the iterable jobs feature, Sidekiq itself now holds the responsibility for making sure these jobs run, continuing to run them as long as there are values left in the enumerator (yet to be processed), and ensuring that all of the work happens sequentially and not in parallel! We don’t have to write any of our own code for that in this new pattern.

This new pattern has a bit of magic elegance to it, too. The secret sauce is all in that build_enumerator method. Explaining that a bit, let’s first notice that we’re returning a custom enumerator from the method. This is the object that Sidekiq will ultimately call .each on to sequence through the objects in the enumerable. In our case, the magic is that we’re essentially spoofing a set of objects and instead just running an infinite loop. A perfect endless loop that spoofs Sidekiq into thinking that the set of objects never ends! Seamless.

Are there any downsides? Well, the interesting thing about this approach is that it’s broadly somewhat similar to the self re-enqueuing job pattern (#1 above) but it solves many of the issues there. By Sidekiq itself essentially taking responsibility for the self-re-enqueuing, we now have guarantees from Sidekiq around the chain never getting broken and staying in a proper sequence indefinitely. Those two things were actually the only issues we had with the self re-enqueuing pattern. Having them solved out-of-the-box is pretty incredible! So aside from the aforementioned sleep call efficiency (again, multi-threading makes this mostly a non-issue), we don’t foresee any downsides with this approach!

…Are we going to switch? As we mentioned, we’ve been running the clock-process approach (#2 above) for the better part of a year now with zero issues, and we are pretty happy with it. Now, it does require more stuff than this new approach — a clock process and encoded schedule for kicking off jobs — but once we set those things up we haven’t really thought about them since. This new pattern does feel simpler though, and that’s always an alluring goal. TBD for us! We may give it a shot with one of our repeating workloads in the coming weeks to see how it does. A test of sorts, if you will.

Anyway, that’s all for this post! Hopefully this new concept and idea sparks something in your mind and you find a neat application for it, too! We’d love to hear back any experiences you, dear readers, have with the iterable jobs setup! Shoot us an email! They go straight to our small team of (all human) devs and we personally read all of them.