The Ultimate Guide to Scaling Sidekiq

Adam McCrea
@adamlogic

Editor's note: Adam first drafted and published this article on Sidekiq's own wiki after chatting with Mike Perham about the value of adding docs to Sidekiq with guidance on actually scaling Sidekiq once it's running in production. We wanted to bring a version of that article to our own blog, and we've updated several sections to reflect the year of development and changes since Adam first wrote that page… time flies!
Sidekiq's architecture makes it easy to scale up to thousands of jobs per second and millions of jobs per day. Scaling Sidekiq can simply be a matter of "adding more servers", but how do you optimize each server, how "big" do the servers need to be, and how do you know when to add more? Those are the questions this guide will answer.
Concepts and terms
Before we dive into concrete guidance and advice, let's start with an overview of Sidekiq's architecture and the various "levers" we have available to us. We'll also define some terms we'll use throughout this guide.
- Concurrency - The Sidekiq setting that controls the number of threads available to a single Sidekiq process.
- Swarm - A feature of Sidekiq Enterprise that supports running multiple Sidekiq processes on a single container.
- Container - A container instance running one or more Sidekiq processes. You might call this a server, service, dyno, pod, etc. We'll just call them containers.
- Total concurrency - The total number of Sidekiq threads across all containers and processes.
Here's how these concepts relate to each other: threads live in processes, processes live in containers, and total concurrency is the product of containers, processes per container, and threads per process.
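A minimal back-of-the-envelope sketch in Ruby (the numbers are placeholders, not recommendations):

containers              = 3  # e.g. dynos, pods, or VMs running Sidekiq
processes_per_container = 1  # more than 1 only if you use Sidekiq Swarm
threads_per_process     = 5  # the Sidekiq "concurrency" setting

total_concurrency = containers * processes_per_container * threads_per_process
# => 15 jobs can be running at any given moment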
Sidekiq is all about queues, of course, so let's clarify some terms here too.
- Queues - You put your jobs into queues (which live in Redis), and Sidekiq processes the jobs in the queue, oldest first (FIFO). When starting a Sidekiq process, you tell it which queues to monitor and how to prioritize them.
- Queue assignment - You can assign queues (or groups of queues) to specific Sidekiq processes, or you can have a single queue assignment used by all Sidekiq processes.
- Queue priority - When assigning multiple queues to a process, Sidekiq has a couple of fetch algorithms that dictate how it pulls jobs from those queues: strict and weighted. We'll call those the queue priority.
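To make strict versus weighted concrete, here's a sketch of how the two look on the Sidekiq command line (the queue names are just examples). Listing queues without weights drains them in strict order; adding weights makes Sidekiq check queues randomly in proportion to their weight:

# Strict: "urgent" is always checked before "default", and "default" before "low"
bundle exec sidekiq -q urgent -q default -q low

# Weighted: "urgent" is checked roughly three times as often as "low"
bundle exec sidekiq -q urgent,3 -q default,2 -q low,1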
And finally we have our connection pools. Yes, multiple connection pools.
- Database connection pool - A pool of database connections shared by all Sidekiq threads within a process. Since a Sidekiq process ultimately spins up a full Rails stack, the database connection pool for a Sidekiq process is simply the same connection pool configured for each Rails process, dictated by database.yml (see the excerpt below).
- Redis connection pool - A pool of Redis connections shared by all Sidekiq threads and Sidekiq internals within a process. This is managed by redis-client and is configured automatically by Sidekiq based on your concurrency.
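On the database side, the database.yml that Rails generates already ties the pool size to RAILS_MAX_THREADS, which is why the pool lines up with Sidekiq's default concurrency out of the box. A trimmed-down sketch:

# config/database.yml (excerpt)
production:
  adapter: postgresql
  # one connection available per thread; falls back to 5 if the variable is unset
  pool: <%= ENV.fetch("RAILS_MAX_THREADS") { 5 } %>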
This covers our terms and concepts, and while there are quite a few, the good news is that many of them are handled for us! The rest are straightforward to configure ourselves given an ounce of understanding. Let's dive in!
A Sidekiq starting point
These are some general recommendations that will help things run smoothly early in an app's life and prepare you to scale later.
The fewer queues the better. Don't make your life harder than it needs to be. Two or three queues are plenty for a new app. We'll talk later about when it makes sense to add more queues, but scaling will generally be more challenging the more queues you have.
Name your queues based on priority or urgency. Some teams name their queues using domain-specific terms that are no help at all when it comes to planning queue priority or latency requirements. "Urgent", "default", and "low" are much easier to work with. You might take this a step further and embrace our recommendation of latency-based queue names such as "within30seconds", "within5minutes", etc. This approach makes it very clear which queues have priority and when queue latency is unacceptable.
Keep your jobs as small as possible! Fan out large jobs into many small jobs. Smaller jobs are much easier to scale, but we'll talk later about strategies to use when this isn't possible.
Run a single Sidekiq process per container. You can add Sidekiq Swarm later, but don't assume you'll need it. This is one less variable to juggle when scaling. Keep it simple.
Choose a container size based on memory. If you're working with a lot of large files, such as generating PDFs or importing large CSV files, you'll need more memory. If you're not doing that, you can probably get away with 1GB or less.
Start with five threads per process (concurrency). This is just a starting point; you will need to tweak it. Many teams get too ambitious with their concurrency, saturating their CPU and slowing down all jobs. The good news is that five is the Sidekiq default, so if you don't do anything, you'll have a good starting point.
These guidelines will get you started, but what about optimizing your configuration and scaling beyond the basics? That's what we'll tackle in the following sections.
Find your concurrency sweet spot
Depending on your container CPU and the type of work your jobs are doing (mainly the percentage of time spent in I/O), you'll probably need to tweak your concurrency setting. As a very simple rule, you want your CPU usage to be high but not 100% when all threads are in use.
If CPU is hitting 100%, you need to reduce your concurrency. If your CPU usage never goes above 50% at max throughput, you probably want to increase your concurrency.
There are several strategies for changing your Sidekiq concurrency value. The simplest is to change your RAILS_MAX_THREADS value, as Sidekiq will "listen to" and respect it. Take caution here, though, if you're changing the value globally in a way that will also affect your web (Rails) processes, as they too will "listen to" the changed value. Accidentally changing your web thread count when you intended to only change your worker thread count can make for a bad afternoon!

That said, since Rails' database pool value is, by default, configured to adhere to RAILS_MAX_THREADS, we nonetheless recommend changing your Sidekiq concurrency by changing RAILS_MAX_THREADS… just ensure that the value is only changed for the Sidekiq process(es), not your Rails web processes!

The easiest way to do this is some fancy per-process environment variable overriding, illustrated below.
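Here's a minimal sketch of that approach in a Heroku-style Procfile (the worker thread count of 10 is just a placeholder):

web: bundle exec rails s
worker: RAILS_MAX_THREADS=10 bundle exec sidekiq

The worker process gets 10 Sidekiq threads and a matching 10-connection database pool, while the web process keeps whatever RAILS_MAX_THREADS is set to in its own environment.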
Autoscale your Sidekiq containers
The simple truth when it comes to "how many containers?" is that you shouldn't waste your energy calculating how many containers you need to run. Sidekiq loads are highly variable by nature, and you don't want to pay for a cluster of 10 containers when no jobs are enqueued. Autoscaling solves this problem by automatically scaling your containers up and down according to load… but what metric should you use for autoscaling?
Sidekiq workloads are more often I/O-bound than CPU-bound; in other words, you can easily encounter a queue backlog even when CPU utilization is low. This makes CPU an inappropriate and frustrating metric for autoscaling, even though it's the metric most commonly used by tools like AWS CloudWatch.
Instead, you should autoscale your Sidekiq containers using queue latency. Your business requirements will have an implicit (or hopefully explicit) expectation of how long each job can reasonably wait before being processed. This expectation makes queue latency the perfect metric for autoscaling. (And if you're using latency-based queue names, you've already identified those latency expectations!)
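If you want to see this metric for yourself, Sidekiq's own API exposes it directly. A minimal Ruby sketch (the queue name is just an example):

require "sidekiq/api"

# Seconds the oldest job in the queue has been waiting (0 if the queue is empty).
latency = Sidekiq::Queue.new("within_5_min").latency
puts "within_5_min latency: #{latency.round(1)}s"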
Queue-time based autoscaling is the reason we built Judoscale, and we believe it's the best autoscaling system for Sidekiq. Judoscale is a team of two-and-a-half Rails developers (including a Rails core team member!) and we've got extensive Sidekiq experience. Judoscale itself runs on Sidekiq! Join over a thousand other teams running Judoscale in production; it's free for most applications!
Assign queues to dedicated processes
Sometimes it makes sense to add a queue for a specific job or a particular "shape" of job. Some examples:
- If you're unable to break down large jobs into smaller jobs, you might not want those long-running jobs to become a bottleneck in your queue.
- If you have some jobs that use lots of memory, you might need a larger container for those jobs.
- If you have jobs that can't be processed in parallel, you might need those jobs on a dedicated queue that runs single-threaded.
These aren't ideal scenarios, but they're real-world scenarios that many apps will encounter. It's best to treat these queues as the anomalies they are and dedicate them to their own Sidekiq process. This way your long-running jobs will only block other long-running jobs, and your memory-hungry jobs won't require all of your jobs to run on larger, higher-priced containers.
This isolation makes scaling easier because you're scaling your "special" queues separately from your "normal" queues. Here's what it might look like in a Procfile, using RAILS_MAX_THREADS to force the memory-hungry jobs to be processed single-threaded (reducing memory bloat):
web: bundle exec rails s
worker: bundle exec sidekiq -q within_30_sec -q within_5_min -q within_5_hours
worker_high_mem: RAILS_MAX_THREADS=1 bundle exec sidekiq -q high_mem
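Individual jobs opt into that isolated queue via sidekiq_options. A quick Ruby sketch (the job class and its arguments are hypothetical):

class ImportGiantCsvJob
  include Sidekiq::Job

  # Route this job to the dedicated queue handled by the
  # single-threaded worker_high_mem process above.
  sidekiq_options queue: "high_mem"

  def perform(upload_id)
    # ...memory-hungry work...
  end
end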
Scaling problems & solutions
The best way to make scaling easy is by keeping it simple: a few queues with small jobs. But of course keeping it simple isn't always easy, especially in a legacy codebase or a large team. Here are some of the problems or anti-patterns you'll generally want to avoid:
- Not enough connections in your database pool. If you're seeing the dreaded ActiveRecord::ConnectionTimeoutError in your Sidekiq jobs, chances are you've misconfigured your database connection pool. We wrote a whole article about how to solve this one too! Don't worry; it's a short article.
- ERR max number of clients reached. Unlike the error above, this error comes from Redis, and it usually means you're using a Redis service with an extremely limited number of connections available. You can either upgrade your Redis service or reduce your concurrency setting.
- Slow job performance / saturated CPU. These go hand-in-hand when you've set your concurrency too high. Reduce your concurrency or use a more powerful container.
- Sporadic queue backlogs. Most apps have extremely variable load patterns for background jobs. If you don't have autoscaling in place, you'll need to run more containers to avoid these backlogs.
- Unreliable autoscaling. If you're not scaling up and down when expected, you're probably autoscaling based on CPU. Autoscale based on queue latency instead. See Judoscale.
- Memory bloat. If your worker containers are using way more memory than you expect, you can either fix the memory bloat or isolate those jobs to their own queue and process, potentially processing them single-threaded. We've written a few posts on memory over the last year that may prove helpful to you here.
- Upstream (database) slow-down. It's easy to scale Sidekiq to the point that you're overloading your database. There's no Sidekiq fix here; you either need to reduce total concurrency to alleviate DB pressure, upgrade your database, or make your queries more efficient.
Scaling Redis
The short answer here is that Redis is almost never the problem when scaling Sidekiq. But for very high-scale apps, you might hit the limits of what's possible with a single Redis server. The sharding wiki article walks you through some options here, and now Dragonfly might be an even better option.
Just remember that most apps don't need this! Make sure you've worked through the earlier suggestions and confirmed that Redis is your bottleneck before proceeding down these paths.
Further reading
Nate Berkopec dives deep into many of the ideas discussed above in his excellent book Sidekiq in Practice. He also has an in-depth article that explains the relationship between processes, threads, and the GVL. For more on latency-based queue names, check out Scaling Sidekiq at Gusto.