If you’ve built, run, or maintained a Rails app for any extended period of time, you’ve likely hosted on Heroku. And if you’ve hosted Rails on Heroku, you’ve probably run into the dreaded R14 Memory quota exceeded error:
Or worse, the R15 Memory quota vastly exceeded error, which will kill and restart your dyno immediately (meaning a few seconds of downtime)!
…and we’d know. The screenshots above are from one of our worker processes a few weeks ago. Nice! 😎
The thing is, it’s tricky to write a universal guide for lowering your memory footprint. An application’s memory usage tends to be tightly linked to what the app does and its domain model. An application that allows uploads of Excel files is going to have a very different memory outlook than a simple, efficient CRUD app. But hey, we’re going to try!
Let’s start, though, with the universal advice that should apply to all applications.
Adjusting Processes
Tweaking our process and/or thread counts is itself a bit of a tricky business that requires some understanding. The rule of thumb for processes has traditionally been “use the number of cores you have”. But Heroku doesn’t tell us how many cores we have! Not on shared hardware, at least. We wrote about this in “Heroku Dynos: Sizes, Types, and How Many You Need”, but the gist of it is that Heroku is fairly opaque in how they describe the number of CPU cores available on each dyno type:
For Standard-1X and Standard-2X dynos we have “1x” and “2x” CPU Share respectively, and similar relative multipliers for Compute, but this isn’t as clear as saying “you get two cores”. Nonetheless, we’ve recommended for years that, for web dynos, we ought to use WEB_CONCURRENCY=1 for Standard-1X dynos and WEB_CONCURRENCY=2 for Standard-2X dynos. This falls in line with the “CPU Share” values, even though those don’t necessarily map to dedicated core counts.
Further, though Heroku’s documentation never spells it out, we believe that Performance-M’s are 2-core and that Performance-L’s are 8-core. So we recommend a WEB_CONCURRENCY of 2 and 8, respectively, for those dyno types.
👀 Note
WEB_CONCURRENCY is the default environment variable for telling Puma how many processes to fork, as defined in the default Puma config file:
```ruby
# If you are running more than 1 thread per process, the workers count
# should be equal to the number of processors (CPU cores) in production.
workers Integer(ENV.fetch("WEB_CONCURRENCY"))
```
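To apply those guidelines, we just set that variable per app. A quick sketch using the Heroku CLI (pick the value matching your dyno size):

```bash
# Standard-1X
heroku config:set WEB_CONCURRENCY=1

# Standard-2X or Performance-M
heroku config:set WEB_CONCURRENCY=2

# Performance-L
heroku config:set WEB_CONCURRENCY=8
```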
But these are all simply guidelines for how many web processes a fresh application ought to run when first getting onto Heroku. What if that’s too much memory usage for the given dyno size?
We can change the dyno type we’re using, change the number of processes we’re running, or both.
Let’s start with the smallest example. Say we’re running a simple Rails application on a Standard-1X dyno and experiencing memory issues. For starters, we might think that since WEB_CONCURRENCY is already set to 1, we can’t go lower. That’s the intuitive answer, but we actually can! WEB_CONCURRENCY=0 runs Puma in “single mode”, where no worker processes are forked at all: the lone process serves requests itself, saving the memory overhead of a separate master process. If that’s the little bump you need to get under your memory quota, great! If it’s not, then we need to look at upgrading to Standard-2X dynos, where we’ll get more memory space (and can hopefully use WEB_CONCURRENCY=2).
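Trying the single-mode route is a one-liner, since it just flips the env var that the Puma config shown above reads:

```bash
heroku config:set WEB_CONCURRENCY=0
```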
On the other hand, what if we’re running an app already on Standard-2X dynos with WEB_CONCURRENCY=2? If other memory strategies (read the rest of this guide and part 2 first!) aren’t helping enough, we can always run Standard-2X dynos with WEB_CONCURRENCY knocked down to 1. We’ll get less overall throughput capacity, but that may solve our memory problem. And, of course, we could upgrade to Performance-M’s where we’ll have roughly similar CPU capacity but more than double the memory space to work with. Both of these options bring their own costs and benefits!
So, ultimately, what we need to remember here is that dropping our WEB_CONCURRENCY count is a real strategy for alleviating memory pressure. It’s a bit of a sledgehammer, because we can lose throughput and capacity with that change, but we’ve got to find a happy medium somewhere!
✅ Tip
We’re mostly leaving background-job processing out of the process-count discussion, since most background job systems for Rails (including Sidekiq, in everything short of its Enterprise edition) run a single process per dyno. For that reason, we recommend running background job processes on Standard-1X dynos and alleviating any memory pressure that might come from them with the other strategies described below.
Adjusting Threads
If process counts are one side of a coin, thread counts are the other. Unfortunately, adjusting the thread count for Ruby processes is a big can of worms. At its root, CRuby only executes one thread of Ruby code at a time, but it can operate concurrently by switching between multiple threads of work while one waits on I/O. We covered this concept extensively in “Why Did Rails’ Puma Config Change?!”, but suffice it to say that we have two levers. If we decrease our thread count, we may save a little memory, but each process will handle fewer jobs or web requests per second. If we increase our thread count, memory may go up a bit and each process will handle more requests / jobs per second, but response-time outliers will get worse as threads contend for the CPU.
In general, we recommend sticking with Rails’ (new) default of 3 Puma threads per web process. If you’re running on a Standard-1X dyno with WEB_CONCURRENCY=0 and need just a little more memory savings, feel free to experiment with a thread count of 2, but understand that you’ll be losing some capacity… so you might end up running more Standard-1X’s anyway!
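For reference, here’s roughly what that looks like in a default config/puma.rb, using the conventional RAILS_MAX_THREADS variable (a sketch of the stock Rails setup, not a drop-in mandate):

```ruby
# config/puma.rb
# Rails' default: 3 threads per Puma process, overridable via env.
# Dropping RAILS_MAX_THREADS to 2 trades some capacity for a little memory.
threads_count = Integer(ENV.fetch("RAILS_MAX_THREADS", 3))
threads threads_count, threads_count
```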
On the worker side, especially when using a Standard-1X for Sidekiq dynos as previously mentioned, we err on the side of Sidekiq’s default here too: 5 threads. Since background jobs have a tendency to be a bit more I/O heavy than synchronous web requests (and since their performance is likely less impactful than web responses), 5 feels like a good number. If you’re experiencing memory issues on your dynos while running 5 threads per Sidekiq process, you may have other issues afoot! We’ll cover these more in part 2.
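As a concrete sketch, the worker dyno’s Procfile entry might look like the line below; the -c flag sets Sidekiq’s thread count, and since 5 is already Sidekiq’s default, it’s shown here only for explicitness:

```
worker: bundle exec sidekiq -c 5
```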
Use Jemalloc
If you’re not currently using Jemalloc on your application, I have exciting news. Enabling Jemalloc can alleviate a lot of memory pressure with zero downside, about 2 minutes of effort, and no cost. It’s great, it’s free, and it’s easy. There is no reason to not use Jemalloc — and many in the Ruby community wish that it would be set as the default for Ruby. That’s a discussion for another time, but needless to say, run Jemalloc on your app and get free memory savings! Just add the buildpack to your app and set the ENV var: JEMALLOC_ENABLED=true.
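Concretely, assuming the widely used community buildpack (gaffneyc/heroku-buildpack-jemalloc), the whole setup is roughly:

```bash
# Load jemalloc ahead of the Ruby buildpack
heroku buildpacks:add --index 1 https://github.com/gaffneyc/heroku-buildpack-jemalloc.git
heroku config:set JEMALLOC_ENABLED=true
```

Buildpack changes take effect on your next deploy.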
It’s hard to overstate how impactful simply switching to Jemalloc can be for a typical Ruby / Rails application’s memory footprint. There are tons of articles out there covering this, so we’ll leave it to a simple affirmation of “yes, you should be doing this!”
Consider NOT Using YJIT
Next, if your application is running Ruby’s YJIT system, consider… not doing that. As Ruby has matured, YJIT has increasingly been adopted for essentially-free performance benefits (and those are real 👏), but the one thing that’s not free about it is memory: your application will use more of it when YJIT is active. After all, the ‘compiled’ code has to be stored somewhere! We’ve personally noted memory increases of around 10% when running YJIT. We hate to advise disabling YJIT, but keep in mind that it is an option if you’re facing memory issues!
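How you turn YJIT off depends on how it got turned on. If you’re on Rails 7.2+, which enables YJIT by default when running Ruby 3.3+, a sketch of the opt-out looks like this (MyApp is a placeholder for your app’s module):

```ruby
# config/application.rb
require_relative "boot"
require "rails/all"

module MyApp
  class Application < Rails::Application
    # Opt out of Rails 7.2's default YJIT enablement
    config.yjit = false
  end
end
```

If you enabled YJIT yourself (via RUBY_YJIT_ENABLE=1 or a --yjit flag in RUBYOPT), removing that env var or flag is all it takes.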
Upgrade Sidekiq
One last note for this post — if you’re running Sidekiq and seeing background job memory levels too high, start with simply upgrading your Sidekiq version! We found excellent improvements to memory stability and management in Sidekiq 7, particularly. It’s an easy win — give it a shot!
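The bump itself is as simple as it sounds; a sketch, assuming a Gemfile pin:

```ruby
# Gemfile
gem "sidekiq", "~> 7.0"
```

Then run bundle update sidekiq, and check Sidekiq’s upgrade notes for anything your app needs to handle across major versions.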
These strategies are, unfortunately, where the advice that applies to all applications ends! Beyond these, you’ve got to put some boots on the ground yourself. Let’s dive into Part 2 for some ideas and strategies for app-specific tools and ways to bring down memory.