Dealing With Heroku Memory Limits and Background Jobs

Adam McCrea

@adamlogic

I added one background job and now I’m priced out of Heroku.

I’ve heard some variation of this too many times to count. Your app hums along fine on Standard dynos…until you add video encoding, giant imports, or some other memory‑hungry job. Suddenly your worker needs a bigger box, and upgrading every worker to Performance dynos feels like buying a school bus because you might carpool once.

There’s a simple pattern that keeps your bill sane and your architecture boring (the good kind): Put the heavy job on its own queue, give it a dedicated worker process, and autoscale that process to zero when it’s idle. The rest of your app stays on Standard dynos.

This post focuses on a real example from Justin Searls on his podcast, Breaking Change—and exactly how I’d set this up on Heroku.

Justin’s story: 4K video meets 1 GB dynos

Justin’s adding support for Instagram Stories (and soon Facebook) to POSSE Party, his tool for syndicating your own content to social media. The shape of the problem:

  • A minute of 4K HDR video is 700–800 MB.
  • Instagram only accepts 1080p within strict codec limits.
  • Custom server-side re‑encoding produces a compliant, compressed file.

Everything worked fine until a real 4K file hit the server. On Heroku Standard dynos (1 GB RAM), FFmpeg spikes past 1.2 GB during the 4K→1080p encode, and Heroku starts swinging the OOM hammer. Sometimes the encode finishes just before Heroku kills the process. Sometimes not. Either way, it's "no way to live," in Justin's words.

2025-10-29:29:13.606617+00:00 heroku[worker.1]: Process running mem=844M(165.0%)
2025-10-29:29:13.608658+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
2025-10-29:29:38.911996+00:00 heroku[worker.1]: Process running mem=810M(158.3%)
2025-10-29:29:38.913195+00:00 heroku[worker.1]: Error R14 (Memory quota exceeded)
2025-10-29:30:03.936646+00:00 heroku[worker.1]: Process running mem=1197M(233.8%)
2025-10-29:30:03.942490+00:00 heroku[worker.1]: Error R15 (Memory quota vastly exceeded)
2025-10-29:30:03.944352+00:00 heroku[worker.1]: Stopping process with SIGKILL
2025-10-29:30:04.133323+00:00 heroku[worker.1]: Process exited with status 137
2025-10-29:30:04.179846+00:00 heroku[worker.1]: State changed from up to crashed

The apparent choices:

  1. Upgrade to Performance dynos (ouch, $$$, especially if you upgrade all workers).
  2. Break out encoding to a separate service you run elsewhere (more moving parts, Active Storage integration gets annoying).
  3. Do it client‑side with WebCodecs (promising, but HDR tone‑mapping and codec constraints are tricky).

There’s a fourth option that’s dead simple and keeps everything on Heroku:

Isolate the heavy job on its own queue, back it with a dedicated worker that uses a Performance dyno, and autoscale that worker from 0→1 only when needed.

The perf dyno runs for a few minutes a month, all automated, costing almost nothing.

The “dedicated worker” pattern (at a glance)

  1. Create a dedicated queue for heavy jobs, e.g. “memory_hog”.
  2. Run a dedicated worker process that only monitors memory_hog with concurrency = 1.
  3. Set that process’s dyno type to a Performance size. Leave the quantity at 0.
  4. Autoscale that worker based on queue latency (queue time).
  5. Enqueue jobs; let autoscaling do the rest.

If you’ve read our post on planning your Sidekiq queues, you know I’m a huge advocate for latency‑based queue names. This is the exception. When memory is the constraint, name the queue accordingly so its purpose is obvious and you can add similar jobs later.

Step-by-step setup

Let’s walk through the actual implementation of this pattern. I’m focusing on Sidekiq and Heroku here, but you can apply the same concepts to any job/task queue and cloud hosting platform.

1. Point the heavy job at a dedicated queue

# app/jobs/encode_video_job.rb

class EncodeVideoJob
  include Sidekiq::Job
  sidekiq_options queue: :memory_hog

  # ...
end
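
The example above uses the Sidekiq::Job API directly. If your jobs go through Active Job instead, routing to the dedicated queue is the same one-liner (a sketch mirroring the class above):

# app/jobs/encode_video_job.rb (Active Job variant)

class EncodeVideoJob < ApplicationJob
  queue_as :memory_hog

  # ...
end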

2. Add a dedicated worker process

# Procfile

worker: bundle exec sidekiq -c 5 -q within_5_seconds -q within_5_minutes
memory_hog_worker: bundle exec sidekiq -c 1 -q memory_hog

Keep it single-threaded (-c 1) to avoid multiplying memory usage. If you truly need parallel encodes later, raise it carefully.
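
To see why one thread matters here: a job like this typically shells out to FFmpeg, so every concurrent thread would spawn its own FFmpeg process on the same dyno. Here's a hypothetical sketch of the job body (illustrative only; Justin's actual encoding code isn't in the post, and the FFmpeg options are made up):

# Hypothetical job body; not Justin's actual code.
class EncodeVideoJob
  include Sidekiq::Job
  sidekiq_options queue: :memory_hog

  def perform(blob_id)
    blob = ActiveStorage::Blob.find(blob_id)

    blob.open do |input|                # downloads the original to a tempfile
      output_path = "#{input.path}-1080p.mp4"

      # One FFmpeg child process per job. With -c 1, that's at most one per dyno,
      # so the memory spike stays bounded.
      success = system(
        "ffmpeg", "-y", "-i", input.path,
        "-vf", "scale=-2:1080",         # downscale to 1080p
        "-c:v", "libx264", "-crf", "23",
        output_path
      )
      raise "FFmpeg encoding failed" unless success

      # ...attach the compressed file, clean up temp files, etc.
    end
  end
end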

A diagram showing a normal worker (standard dyno) and a separate memory_hog worker (perf dyno)

3. Set the dyno type, not the dyno count

In the Heroku dashboard, open the memory_hog_worker process and choose a Performance dyno size (whatever meets your memory needs). Leave the quantity at 0. This only tells Heroku what kind of dyno to use when you scale up later.

Heroku process settings for memory_hog

4. Wire up autoscaling in Judoscale

You can use any autoscaler that can scale your workers based on job queues. This is the Judoscale blog, so we’re using Judoscale.
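
For a Rails + Sidekiq app, reporting the queue metrics Judoscale needs is just a matter of installing the adapter gems (a minimal sketch; see the Judoscale docs for full install steps):

# Gemfile

gem "judoscale-rails"   # reports web request queue time
gem "judoscale-sidekiq" # reports Sidekiq queue latency, per queue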

  • Process: memory_hog_worker
  • Scale range: 0–1 dynos (raise the max if you ever need parallel encodes)
  • Scale up when queue latency reaches 1 second (i.e. anything is sitting in the queue)
  • Scale down when queue latency drops below 1 second (essentially idle)

Judoscale rule screenshot — “Scale 0–1 when Queue Time hits 1 second.”

Now when you enqueue a video, queue latency rises, Judoscale starts one Performance dyno for the memory_hog_worker process, FFmpeg runs, and the worker scales back to 0 when the queue drains.
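
Nothing about the call site changes; you enqueue the job like any other Sidekiq job (the argument here is hypothetical):

# e.g. from a controller, a model callback, or the Rails console
EncodeVideoJob.perform_async(video_blob.id)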

5. Test it end‑to‑end

  1. Enqueue a big video.
  2. Watch queue latency bump.
  3. See Judoscale scale memory_hog_worker 0→1.
  4. Encode finishes; dyno scales back to 0.
  5. Celebrate not paying for a big box 24/7.

Gracefully handling long-running jobs

Jobs that consume a lot of memory also tend to be long-running jobs. We consider a “long-running job” to be any job that takes longer than the shutdown timeout for the given job processor, which is usually 25 seconds.

Out of the box, Judoscale will downscale a worker service as soon as the queue is empty. As long as your jobs complete within the shutdown timeout, this is fine: the worker receives the shutdown signal, finishes the job, then shuts down. If a job takes longer, it'll be killed before it can finish 👎.

Judoscale handles this scenario with an opt-in configuration to prevent downscaling when jobs are busy.

Screenshot: Option to prevent downscaling when jobs are busy

To see this option in the UI, you must enable “busy job” tracking in your code. Check out the docs for details.
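
As a rough sketch, enabling it in the judoscale-sidekiq adapter looks something like this (double-check the docs for the current option name):

# config/initializers/judoscale.rb

Judoscale.configure do |config|
  # Report busy (in-progress) jobs so Judoscale won't downscale while one is running.
  # Option name per the judoscale-sidekiq docs at the time of writing; verify against the docs.
  config.sidekiq.track_busy_jobs = true
end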

Alternative approaches

Hopefully it’s obvious why this beats the “just upgrade everything” approach: upgrading every worker is completely unnecessary. If you only need a Performance dyno for one occasional job, there’s no reason to pay for it 24/7.

The other alternative I often hear (and Justin mentioned on the podcast) is extracting this work into a separate service outside of Heroku. It’s cliché at this point to talk about how cheap hardware is when you get closer to the metal. Yes, you pay a substantial tax to use a PaaS like Heroku, but that tax buys back your time. There’s simply no reason to waste that time spinning up new infrastructure when a simple (and cheap!) solution exists on your current platform.

👀 Note

For more general advice on reducing memory usage in Rails, check out our other posts: How to Use Less Memory, Part 1 and Part 2.

Take action

Judoscale was built for exactly this: autoscaling workers by queue latency, including scaling to zero. Turn it on, ship your feature, and leave the school bus at the dealership.

If you want help wiring it up, email us or call us.