Rails on Heroku: How to Use Less Memory Pt. 2

Jon Sully

@jon-sully
  1. “Use Less Memory” Part 1
  2. “Use Less Memory” Part 2 (This page!)

Welcome back for Part 2 of Rails + Heroku memory tips and tricks! In Part 1 we covered some universal basics, but here we want to dig in a little bit. This article is going to contain a lot of suggestions and helpful tidbits, but we can’t dictate concrete advice for any specific application! So as you read on, make sure to think critically about how each of these points relates to your specific application and domain.

Check Your APM Tool

If you’re facing memory issues and are already confident that your settings from Part 1 are satisfied, the first place to start is your Application Performance Monitoring tool. We look at our APM tool all the time. We like Scout but have no real affiliation here — any APM will do.

You just need a reliable tool that can show you a history of your application’s requests, processing, and background job info. For starters, Scout has a “Memory Bloat Insights” tab made specifically for this task — show me the jobs or endpoints that are increasing my memory footprint!

Screenshot of Scout APM’s memory bloat tab showing a few places that spike memory when run

The end game here is simply to identify jobs or endpoints that are causing memory to increase in your application. We’re narrowing our search. If we at least know where the bloat is happening, we can (hopefully) read through the source code of that endpoint/job and start to determine why memory is increasing in those places!

That said, an APM isn’t going to tell you “Ah, you have a memory issue here because you wrote .each here — you should probably use .find_each instead”. At least, not yet! APMs are great for tracking down where memory bloat is occurring but can’t help you determine why. Let’s go through some of the common ‘whys’, though.

Optimize Queries

We’d wager that seven times out of ten, memory bloat comes from how we use ActiveRecord and/or query for data from our database.

For context, Rails devs should always keep in mind that when we query ActiveRecord from models, the queries generally return model objects. It’s not as if we’re running a SQL query then storing the resulting data as an array in memory. For any query that originates from a Model class (such as Book.last(10) or Article.where(published: true)), ActiveRecord automatically takes every record from the result set of the query and converts each record over to a Ruby object (a model instance), now sitting in memory. That’s why our queries return arrays of objects: ActiveRecord automatically makes Ruby objects from the results of the query.

That’s a double-edged sword, though. Ruby objects, especially ActiveRecord model instances, can be memory-heavy. And if we’re not careful about our queries we can end up with thousands of objects being instantiated into Ruby objects in memory.

✅ Tip

Want to try a little experiment with this yourself? It’s actually quite easy to run with an existing app on Heroku. Just fire up a Heroku run dyno with a Rails console:

$> heroku run bundle exec rails c -a YOUR_APP_NAME

Then pick a model from your app that either a) has a lot of stuff going on in its model code / file (“fat” model), or b) just has a ton of records. Or, if you can, a model that has both!

Finally, run the following sequence of commands to instantiate a ton of those model records into memory:

def memory_usage
  `ps -o rss= -p #{Process.pid}`.to_i / 1024.0
end

puts "Initial memory usage: #{memory_usage} MB"

puts "Running memory-intensive query..."

objects = YOURMODEL.last(20_000)
nil # just a nil command to prevent the prior from outputting to STDOUT in some cases

puts "Memory usage after query: #{memory_usage} MB"

And feel free to play with the 20,000 number. I found that, in my particular test application, I couldn’t load more than about twenty thousand records of my model before my one-off dyno outright crashed and booted me off. In multiple runs I’d see a spike from an initial memory of ~250MB to about 500MB. Ouch!

Given that my one-off dyno was essentially a Standard-1X with only 512MB of memory available, this makes sense. But it goes to show you how loading a few thousand extra records can really make a difference!
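As a preview of the query techniques below, try the same experiment in a fresh console session (fresh because Ruby rarely hands freed memory back to the OS), but fetch only IDs instead of full records. This is just a sketch reusing the memory_usage helper and YOURMODEL placeholder from above:

puts "Initial memory usage: #{memory_usage} MB"

# SELECT only the primary key: no model instances are built, just a plain array of integers
ids = YOURMODEL.order(id: :desc).limit(20_000).ids
nil # again, prevent the console from printing the whole array

puts "Memory usage after ids-only query: #{memory_usage} MB"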

But alas, even with all of this preamble, this does tend to be the most-often committed memory mistake in Rails applications: inadvertently instantiating lots of ActiveRecord objects when you didn’t mean to can eat up your memory.

So be careful in how you write your queries in your application! Keep in mind that even simple assignments in controller code like:

class SomeController < ApplicationController
  before_action :set_users

  # ...

  private def set_users
    @users = User.where(active: true)
  end
end

Will instantiate all of those user records into ActiveRecord objects in memory! We’d recommend getting into pagination when facing unavoidable large queries in your controllers. And, if instead we’re talking about background jobs, just use ActiveRecord’s built-in batching mechanisms: .find_each or .in_batches — both of which load and instantiate only one chunk of the query result at a time, releasing that memory before moving on to the next chunk.
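To make the batching point concrete, here’s a minimal sketch (the mailer call and the digest_sent_at column are hypothetical stand-ins, just to show the shape):

User.where(active: true).find_each do |user|
  # find_each loads and instantiates users 1,000 at a time (the default batch size),
  # so only one batch's worth of objects sits in memory at once
  UserMailer.with(user: user).digest_email.deliver_later # hypothetical mailer
end

User.where(active: true).in_batches(of: 1_000) do |relation|
  # in_batches yields a relation per batch; update_all issues one UPDATE per batch
  # and never instantiates any User objects
  relation.update_all(digest_sent_at: Time.current) # hypothetical column
end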

Alternatively, one technique we employ often is to only get IDs when all we need is IDs. This is particularly applicable when we’re kicking off background jobs for a bunch of records. Sure, we could do this:

User.where(active: true).find_each { |usr| MessageUserJob.perform_async(usr.id) }

Which at least employs batching as mentioned before (via .find_each) so we shouldn’t have massive memory bloat… but we’re instantiating each User as a full-fledged ActiveRecord object only to call .id on it and move on! That’s a waste of memory and processing power! Let’s instead just get the IDs alone and use them:

User.where(active: true).ids.each { |id| MessageUserJob.perform_async(id) }

Here we’ve skipped the entire instantiation process and instead get back a simple array of integers. Much smaller, much faster, and easy to iterate through and kick off background jobs from 😁

✅ Tip

Pro tip: if you’re using Sidekiq, you should be using bulk enqueuing anyway!

Sidekiq::Client.push_bulk(
  "class" => MessageUserJob,
  "args" => User.where(active: true).ids.map { [it] } # array of arrays; `it` requires Ruby 3.4+
)

And the same holds true for ActiveJob in general if you’re going that route:

# ActiveJob.perform_all_later requires Rails 7.1+ (and `it` requires Ruby 3.4+)
jobs = User.where(active: true).ids.map { MessageUserJob.new(it) }
ActiveJob.perform_all_later(jobs)

Avoid N+1 Queries

N+1 queries are a notorious memory hog. While the issues described above create a single huge spike in memory as thousands of objects are instantiated all at once, N+1 queries create a more nuanced memory problem. In short, N+1 queries can hide their memory bloat because the allocations are spread out: instead of one big batch of objects taking up memory, we end up with N different small queries, each instantiating objects in different memory locations.

We’re not going to extensively explain what an N+1 query is here (there are tons of great articles on that!) but we do want to explain how to find one!

It might not surprise you that we’re going to point back to our APM tool for initial suspicions here.

Screenshot of Scout APM’s N+1 tab showing a few places that apparently have N+1 query issues

Most APM tools have some kind of capability for revealing locations in our applications that appear to have an N+1 condition going on. Luckily, Rails makes it quite easy to solve N+1 queries. Often it’s as simple as calling .includes() in your query-chain to pre-load the association on all result records. We recommend reading up on solving N+1 issues if you’re seeing many of them in your application. While solving these will improve your application’s overall response time, it will also help alleviate memory issues you might be facing.
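As a quick illustration, here’s a minimal before/after sketch (assuming a hypothetical Post model with an author association):

# N+1: one query for the posts, then one additional query per post for its author
posts = Post.where(published: true)
posts.each { |post| puts post.author.name }

# Fixed: .includes pre-loads all of the authors in a single additional query
posts = Post.includes(:author).where(published: true)
posts.each { |post| puts post.author.name } # no per-post queries here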

Handle Files Better

The last piece of ‘dig-in’ advice we’ll share here revolves around how we handle files in Rails. Typically, this entails how we handle users’ files as they upload them to our servers, but technically, any time our Rails application interfaces with files directly, we should be careful.

Ultimately, memory issues deriving from file-handling can be ticking time bombs! Until the right user uploads the right file, you might never know you have an impending memory crash on the way. But once the right file comes in… look out!

The simplest rule of thumb with files, especially user files, is to avoid loading the entire file into memory. Even from day zero, when you’re just implementing a proof-of-concept that totally won’t be the production code (😏), make sure that you never load full files into memory. See, this:

file_content = File.read(uploaded_file.path)

Is dangerous. If I somehow upload a 4.5GB spreadsheet to your system, you’ve now run code that will attempt to load that full file into memory. ☠️ Not good.

Instead, in conjunction with ensuring you have safeguards around file types and formats, and verifications that the actual file contents match the data types you expect, we should stream files. For text files, that looks like:

File.open(uploaded_file.path, 'r') do |file|
  file.each_line do |line|
    process_line(line)
  end
end

For binary files (images, PDFs, videos, etc.), that might look more like:

File.open(binary_file.path, 'rb') do |file|
  while (chunk = file.read(1024)) # Read 1KB at a time
    process_chunk(chunk)
  end
end

Now, we obviously can’t dictate an ‘answer’ here — every application is going to have its own needs around the files it deals with and what it should do with that data. We just want to raise a big flag here: “STREAM FILE CONTENTS” to help you avoid memory headaches in the future!

Also, we should call out safeties around sending files to users. If you intend to send files via the responses from your Rails controllers, you should take special care to ensure that these files are streamed too. You don’t want the Rails application to generate / build / read / etc. the entire file into memory then deliver it to a client browser / system in full. The client / browser should stream chunks from your Rails application in kind.

Luckily, that can be as simple as adding stream: true:

class FilesController < ApplicationController
  def download
    file_path = Rails.root.join('path', 'to', 'large_file.zip')

    send_file file_path,
              type: 'application/zip',
              disposition: 'attachment',
              stream: true,
              buffer_size: 64.kilobytes
  end
end

Using Less Memory on Heroku

There we have it! While memory management with Rails on Heroku can feel like walking a tight-rope at times, hopefully we’ve given you some strategies and ideas to alleviate your memory pressure here! Balancing dyno memory constraints, throughput needs for your application, and our specific coding styles / method choices is… tricky. But with a keen eye and conscious mind, we can run some fantastically memory-efficient Rails apps on Heroku!

Ultimately, the key is awareness. Memory issues rarely appear overnight; they’re the result of small inefficiencies accumulating over time. Regularly monitoring your app with your APM, profiling your memory usage, and auditing your codebase for inefficiencies are essential practices for maintaining a lean and reliable Rails app. Heroku’s platform can work wonders for simplicity and scalability, but it requires diligence to ensure your app is staying within its memory budget.

And, if you’re already facing R14 or R15 errors, don’t worry! Start small, focus on the low-hanging fruit, and work methodically. By following the steps and checks we’ve outlined, you’ll have a healthy, memory-efficient Rails app in no time. 🚀