Effective Response Time Autoscaling for Any Programming Language
We offer response time autoscaling for web dynos on Heroku as a fallback option when we don’t have a package for your language or framework. Response time is the same metric Heroku’s own autoscaler uses, and while it’s not as effective as queue time autoscaling, it’s much better than no autoscaling at all.
When should you use response time autoscaling?
You should use response time autoscaling when queue time autoscaling isn’t an option, meaning we don’t have a package for your language. We currently have packages for Ruby, Python, and Node.js.
If your app is written in Java, PHP, Go, Crystal, or any other language, you can still autoscale effectively with response time.
👀 Note
You can use response time autoscaling even if there’s a package for your language or framework, but we recommend queue time autoscaling whenever it’s available.
How do you set up response time autoscaling?
When you launch Judoscale for the first time, you’ll see our package setup wizard.
If we don’t have a package for your language, you’ll be prompted to “monitor response time”. This sets up a log drain from your Heroku app to Judoscale so we can parse your router logs to get response time data.
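To make that concrete, here’s a minimal Python sketch of the parsing involved. The log line below uses Heroku’s documented router log format, where the `service` field records how long the dyno spent serving the request. The parsing code itself is illustrative, since Judoscale handles this for you once the drain is in place.

```python
import re

# Example Heroku router log line, the kind a log drain delivers.
# The "service" field is the time the dyno spent serving the request.
LOG_LINE = (
    'at=info method=GET path="/" host=example.herokuapp.com '
    'dyno=web.1 connect=1ms service=42ms status=200 bytes=1548 protocol=https'
)

SERVICE_RE = re.compile(r"\bservice=(\d+)ms\b")

def response_time_ms(line: str) -> int | None:
    """Pull the response (service) time, in milliseconds, from a router log line."""
    match = SERVICE_RE.search(line)
    return int(match.group(1)) if match else None

print(response_time_ms(LOG_LINE))  # => 42
```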
Once you’ve enabled response time monitoring, you will see your response time data in the Judoscale dashboard. Review the autoscale settings (the defaults are usually a great place to start) and click “Save and Enable Autoscaling”.
What languages are supported for response time autoscaling?
All of them. Heroku records service time for every request at the router, so any language that can run on Heroku can be autoscaled with Judoscale’s response time mode. When you choose “monitor response time”, we automatically provision a log drain, ingest those router logs, and aggregate the timings for autoscaling.
If we offer a queue-time adapter for your stack—today that covers Ruby, Python, and Node.js—you can still opt into response time, but we recommend sticking with queue time whenever it’s available. For everything else, including Go, Java, PHP, Crystal, or custom runtimes, response time autoscaling gives you full coverage without writing a custom adapter.
Pair response time autoscaling with schedules if you have known traffic swings. You can pin a minimum dyno count overnight, then allow Judoscale to scale up quickly as response time rises during busier periods.
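As a rough illustration of how the two interact, here’s a Python sketch of a schedule floor combined with a response time trigger. The function names, hours, and thresholds are all hypothetical; in practice you configure these in the Judoscale dashboard rather than writing code.

```python
from datetime import datetime, timezone

def scheduled_minimum(now: datetime) -> int:
    """Hypothetical schedule: a floor of 1 dyno overnight (UTC), 3 during the day."""
    return 1 if 2 <= now.hour < 8 else 3

def desired_dynos(p95_response_ms: float, current: int, now: datetime) -> int:
    """Scale up when response time breaches the target, but never drop below the schedule floor."""
    target_ms = 250  # upper bound of an example target response time range
    wanted = current + 1 if p95_response_ms > target_ms else current
    return max(wanted, scheduled_minimum(now))

# A 400ms p95 breaches the 250ms target, so we scale from 2 to 3 dynos.
print(desired_dynos(p95_response_ms=400, current=2, now=datetime.now(timezone.utc)))
```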
What if my service already emits OpenTelemetry metrics?
On Heroku’s Fir stack we consume the OpenTelemetry metrics that the platform publishes, so you get response time autoscaling without managing a log drain. We don’t yet accept arbitrary OpenTelemetry pushes into our API, though, so the log drain approach remains the best way to autoscale custom runtimes today. If you need to report metrics yourself, you can still build a custom adapter and post queue time data directly.
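If you do go the custom adapter route, the core of it is computing queue time from Heroku’s `X-Request-Start` header, which the router sets to the epoch-milliseconds timestamp at which it received the request. Here’s a hedged Python sketch; the endpoint URL and payload shape are placeholders, not Judoscale’s real API, so check our adapter docs for the actual reporting contract.

```python
import time

import requests  # third-party: pip install requests

# Placeholder URL -- not Judoscale's real adapter endpoint.
REPORT_URL = "https://adapter.example.invalid/v1/metrics"

def queue_time_ms(x_request_start: str) -> float:
    """Queue time = now minus the router's X-Request-Start timestamp (epoch ms)."""
    return max(time.time() * 1000 - float(x_request_start), 0.0)

def report_queue_time(x_request_start: str) -> None:
    """Post one queue time sample; a real adapter would batch these."""
    payload = {"queue_time_ms": queue_time_ms(x_request_start)}  # hypothetical shape
    requests.post(REPORT_URL, json=payload, timeout=2)
```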
How effective is response time autoscaling?
Of course, we’re big fans of queue time autoscaling, but response time autoscaling is still better than non-request-based metrics like CPU or memory usage. It’s also better than no autoscaling at all.
Naturally, most apps have some endpoints that are slower than others, and these slow requests can skew your response time data. That can cause response time autoscaling to trigger more “false positive” scale events than queue time autoscaling would. You may want to widen your target response time range to compensate, as the example below shows.
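A quick hypothetical shows the skew: nine fast requests plus one slow report endpoint are enough to push the average past a tight target.

```python
# Nine 50ms requests plus one 2000ms report endpoint (made-up numbers).
timings_ms = [50] * 9 + [2000]

mean_ms = sum(timings_ms) / len(timings_ms)
print(mean_ms)  # 245.0 -- past a 200ms target even though the app is healthy
```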
How does Judoscale response time autoscaling compare with Heroku?
Heroku’s autoscaler is also based on response time, but there are significant differences, even when you use Judoscale’s response time autoscaling:
- Judoscale works with all dyno types, not just performance dynos.
- Judoscale offers you more control over how your dynos scale. You can jump by multiple dynos at a time, and you can set up a schedule for predictable traffic dips or spikes.
- Judoscale is fast! We can respond to a traffic spike in as little as 30 seconds, while it often takes Heroku several minutes.