Autoscaling web dynos
Judoscale can autoscale your web dynos out of the box with one click.
In practice, you’ll probably want to review your web metrics and autoscaling settings first.
Understanding your web dyno metrics
As soon as you install the add-on, Judoscale will begin monitoring your response time, throughput, and scale (how many dynos are running).
In the metrics chart, the green shaded area is your “response time range” for autoscaling. An upscale is triggered by any peak above this range. A downscale is triggered when the metric settles below the range for a period of time.
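The range behavior described above can be sketched as a small decision function. This is an illustration only, not Judoscale's actual algorithm: the names ResponseTimeRange, decide, and settle_samples are hypothetical, the thresholds are made-up values, and "a period of time" is approximated here as a fixed number of consecutive samples.

```python
from dataclasses import dataclass


@dataclass
class ResponseTimeRange:
    # Hypothetical thresholds in milliseconds; not Judoscale defaults.
    low_ms: float
    high_ms: float


def decide(samples_ms, target, settle_samples):
    """Return "upscale", "downscale", or "hold" for a list of samples.

    samples_ms: metric samples, oldest first.
    settle_samples: assumed number of consecutive trailing samples that
    must sit below the range before a downscale is suggested.
    """
    if not samples_ms:
        return "hold"
    # A peak above the range (here: the latest sample) triggers an upscale.
    if samples_ms[-1] > target.high_ms:
        return "upscale"
    # Downscale only once the metric has settled below the range.
    recent = samples_ms[-settle_samples:]
    if len(recent) == settle_samples and all(s < target.low_ms for s in recent):
        return "downscale"
    return "hold"
```

With a range of 200 to 500 ms, a latest sample of 900 ms yields "upscale", while three consecutive samples under 200 ms yield "downscale" when settle_samples is 3.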
You can use your metrics as a guide for how to configure your response time range for autoscaling.
The response time metric is available for all applications, since we extract it from Heroku’s router logs. However, for more reliable autoscaling, you can autoscale based on request queue time by installing the Judoscale adapter library for your language or framework.
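To make "request queue time" concrete, here is a minimal sketch of how it could be measured inside an app. It assumes the Heroku router's X-Request-Start header, which carries the time the router received the request as a Unix timestamp in milliseconds. The function name queue_time_ms is hypothetical, and this is not necessarily how the Judoscale adapters compute the metric.

```python
import time


def queue_time_ms(headers, now_ms=None):
    """Estimate how long a request waited before the app picked it up.

    headers: a dict-like object of request headers.
    now_ms: current time in milliseconds (injectable for testing).
    Returns None when the header is missing or not an integer.
    """
    raw = headers.get("X-Request-Start")
    if raw is None:
        return None
    try:
        started_ms = int(raw)
    except ValueError:
        return None
    if now_ms is None:
        now_ms = int(time.time() * 1000)
    # Clamp at zero so minor clock skew never reports a negative wait.
    return max(0, now_ms - started_ms)
```

Unlike total response time, this number excludes the time the app spends handling the request, so it rises mainly when requests are waiting for a free dyno.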
Configuring web autoscaling
In addition to your response time range (or queue time range if you’ve installed an adapter), you can configure your dyno range, upscale jumps, and downscale delay.
Your dyno range is a way of limiting risk exposure. Some customers never want fewer than two dynos running, even when their app is under very light load. Others are comfortable scaling down to a single dyno.
Heroku recommends running at least two dynos at all times for redundancy, but we believe that to be overly conservative when autoscaling is in place. You can safely scale down to a single dyno, and Judoscale will quickly upscale your app if there’s a problem on that dyno.
Your upscale jumps and downscale delay control the sensitivity, or cadence, of autoscaling. If you want to recover from a slowdown as quickly as possible, consider scaling up by multiple dynos at a time. If you want to maximize cost savings and resource efficiency, consider shortening the downscale delay.
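As a rough illustration of how these settings interact, the sketch below applies a dyno range, an upscale jump, and a downscale delay to a single scaling decision. All names (AutoscaleSettings, next_dyno_count, minutes_below_range) are hypothetical, and removing one dyno per downscale is an assumption made for the example, not documented Judoscale behavior.

```python
from dataclasses import dataclass


@dataclass
class AutoscaleSettings:
    min_dynos: int            # lower bound of the dyno range
    max_dynos: int            # upper bound of the dyno range
    upscale_jump: int         # dynos added per upscale
    downscale_delay_min: int  # minutes below range before downscaling


def next_dyno_count(current, decision, minutes_below_range, settings):
    """Return the dyno count after applying one scaling decision."""
    if decision == "upscale":
        target = current + settings.upscale_jump
    elif (decision == "downscale"
          and minutes_below_range >= settings.downscale_delay_min):
        # Assumption for this sketch: remove one dyno at a time.
        target = current - 1
    else:
        target = current
    # The dyno range always wins, whatever the decision was.
    return max(settings.min_dynos, min(settings.max_dynos, target))
```

For example, with a range of 1 to 10 dynos and an upscale jump of 3, an upscale from 2 dynos goes to 5, and an upscale from 9 dynos stops at 10. A shorter downscale delay lets the count fall sooner after traffic subsides, at the cost of more frequent scaling events.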
Read more about these settings in Refining your autoscaling behavior.