Optimizing Autoscale Settings: When to Tune for Performance & Costs

When Tuning Is Needed

The default settings work well for most applications, and there’s no formula for determining the perfect autoscale settings. The best way to optimize for your app is to start with the defaults and observe how your app is autoscaled.

If your app has trouble recovering quickly from slowdowns, you might benefit from scaling up by multiple dynos at a time.

If you want to optimize for cost savings, you might want to downscale faster by shortening the time between downscale events.

Upscale Jumps

By default, Judoscale will scale up or down by one dyno at a time. It will continue to do this until queue time has settled within the target queue time range (between the upscale and downscale thresholds).

If your app receives sudden bursts of traffic and needs to scale up as quickly as possible, you might want to scale up by multiple dynos at a time.

Screenshot: Upscale jumps control

You can scale up by as many dynos as you want, but keep in mind you’ll never be autoscaled beyond your autoscale dyno range.

For example, let’s say your dyno range is 2–10 dynos, and you’ve set “upscale jumps” to 2 dynos. If you’re currently running 7 dynos, the first upscale event jumps to 9 dynos. A subsequent upscale event scales only to 10 dynos, because the top of your dyno range limits that jump to a single dyno.
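The arithmetic in this example can be sketched in a few lines of Python. This is an illustration only, not Judoscale's implementation; the function name `next_dyno_count` and its parameters are hypothetical.

```python
def next_dyno_count(current: int, jump: int, min_dynos: int, max_dynos: int) -> int:
    """Return the dyno count after one upscale event, clamped to the dyno range.

    Hypothetical helper: it only models the rule described above, where a
    jump never takes the app beyond the configured autoscale range.
    """
    return max(min_dynos, min(current + jump, max_dynos))


# Dyno range 2-10, upscale jumps set to 2, currently running 7 dynos.
first = next_dyno_count(7, jump=2, min_dynos=2, max_dynos=10)
second = next_dyno_count(first, jump=2, min_dynos=2, max_dynos=10)
print(first, second)  # 9 10
```

The second event adds only one dyno because the range maximum, not the jump size, is the binding limit.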

Upscale Frequency

Upscale frequency is the cool-down between consecutive upscale events. It has no effect on the first upscale—only on how quickly Judoscale can add even more capacity if queue time (or utilization) is still beyond your targets.

  • Keep the frequency at least as long as it takes for a new dyno, task, or machine to come online. On Heroku that’s usually 30–45 seconds; on ECS it can be closer to a minute. If you upscale again before the previous capacity is live, you risk overshooting and paying for extra containers.
  • If your dashboards show multiple upscales during deploys, try lengthening the upscale frequency (a longer cool-down) so Judoscale waits long enough to see the first deploy tasks finish booting. That often reduces “double upscales” caused by brief launch delays.
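As a rough model of the cool-down described above, the sketch below checks whether a follow-up upscale would be allowed. The names are hypothetical, and treating the boundary as "greater than or equal" is an assumption rather than documented behavior.

```python
from typing import Optional


def may_upscale_again(seconds_since_last_upscale: Optional[float],
                      upscale_frequency_seconds: float) -> bool:
    """Hypothetical cool-down check between consecutive upscale events."""
    if seconds_since_last_upscale is None:
        # No earlier upscale: the cool-down has no effect on the first one.
        return True
    return seconds_since_last_upscale >= upscale_frequency_seconds


def frequency_covers_boot(upscale_frequency_seconds: float,
                          boot_seconds: float) -> bool:
    """True when the cool-down is at least as long as the capacity boot time."""
    return upscale_frequency_seconds >= boot_seconds


# A 60-second cool-down covers a 45-second boot; a 30-second one does not.
print(frequency_covers_boot(60, 45), frequency_covers_boot(30, 45))  # True False
```

If `frequency_covers_boot` is false for your platform's typical boot time, a second upscale can fire before the first batch of capacity is serving traffic.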

Upscale Sensitivity

Upscale sensitivity is the minimum amount of time a metric must remain above its target before Judoscale sends an upscale request. The setting is intentionally capped at 30 seconds: waiting longer would delay responses to real traffic spikes. For deploy-related queue spikes, combine sensitivity with the upscale frequency advice above: lengthen the upscale frequency so Judoscale has time to observe the new capacity, and consider scheduling deploys during lower-traffic windows.
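To make the "must remain above its target" rule concrete, here is a minimal sketch that checks whether the most recent breach has lasted long enough. The sample format, function name, and units are assumptions for illustration; Judoscale's actual evaluation may differ.

```python
from typing import List, Tuple


def should_request_upscale(samples: List[Tuple[float, float]],
                           target: float,
                           sensitivity_seconds: float) -> bool:
    """Hypothetical check: has the metric stayed above target long enough?

    samples is a list of (timestamp_seconds, metric_value), oldest first.
    """
    breach_start = None
    last_ts = None
    for ts, value in samples:
        last_ts = ts
        if value > target:
            if breach_start is None:
                breach_start = ts  # the breach begins at this sample
        else:
            breach_start = None  # metric recovered, so the timer resets
    if breach_start is None:
        return False
    return last_ts - breach_start >= sensitivity_seconds


# A brief dip at t=20 resets the timer, so only 10 seconds of breach remain.
blip = [(0, 120), (10, 130), (20, 90), (30, 140), (40, 150)]
sustained = [(0, 120), (10, 130), (20, 140)]
print(should_request_upscale(blip, target=100, sensitivity_seconds=20))       # False
print(should_request_upscale(sustained, target=100, sensitivity_seconds=20))  # True
```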

If you need to suppress scaling entirely during a deploy, you can disable autoscaling through our Settings API at the start of the rollout and re-enable it once the new tasks are in service.

Downscale Delay

Downscale delay controls both the wait before the first downscale event and the time between consecutive downscales.

Screenshot: Downscale delay control

Downscaling is not triggered immediately when queue times drop below the downscale threshold, because this could result in yo-yo autoscaling (rapidly scaling up and down). Judoscale waits to downscale until queue times have remained below the downscale threshold for 10 minutes (the default), or whatever you specify here.

After that, your app will continue downscaling at the same rate (every 10 minutes by default) as long as queue time remains completely below the downscale threshold.
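The resulting schedule can be sketched as follows, assuming queue time stays below the downscale threshold for the whole period. The function name is hypothetical, and the sketch models only the timing described above, not the dyno range or any other setting.

```python
from typing import List


def downscale_times(seconds_below_threshold: int,
                    delay_seconds: int = 600) -> List[int]:
    """Hypothetical schedule of downscale events, in elapsed seconds.

    The first event fires once the delay has passed, and further events
    repeat at the same interval while queue time stays below the threshold.
    """
    return list(range(delay_seconds, seconds_below_threshold + 1, delay_seconds))


# 25 quiet minutes with the default 10-minute delay: two downscale events.
print(downscale_times(25 * 60))  # [600, 1200]

# Shortening the delay to 5 minutes downscales faster over the same period.
print(downscale_times(25 * 60, delay_seconds=300))  # [300, 600, 900, 1200, 1500]
```

A shorter delay saves more during quiet periods, at the cost of a higher chance of scaling back up soon after.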