A common complaint about Heroku is that as you scale, it can become expensive much faster than you’d expect. If you’re fortunate enough to have usage that grows your need for compute, you’ll pay Heroku more for that increased usage than you would if you were hosting your infrastructure yourself. This is part of the tradeoff of using a platform like Heroku.
Still, scaling costs are a good problem to have! In this article, we’re going to take a look at what it’s like to scale on Heroku to meet demand. We’ll look at the differences between horizontal and vertical scaling, the different dyno options on Heroku, and even autoscaling! Let’s get into vertical scaling first.
Vertical scaling
Vertical scaling is the process of adding more resources (RAM, CPU, etc.) to a server to make it more powerful. On Heroku, this is done by upgrading to a larger dyno that has more resources. You can change your dyno manually in the Heroku dashboard, and it’s relatively quick to do.
Benefits of vertical scaling
One benefit of vertical scaling is that it can help you handle more computationally complex tasks. Things requiring a lot of data manipulation in memory, even background jobs, can be made possible by adding more memory to the server. On Heroku, this looks like upgrading to a dyno that has more memory.
Vertical scaling can also speed up work that is bottlenecked by CPU or memory, so it makes sense to add at least enough resources to handle your most complex workflows, whether they’re background jobs or web endpoints.
Another reason people choose to scale vertically is that it is easy to implement. You can simply choose a higher-tier Heroku dyno in the Heroku UI under the “Resources” tab.
Limitations of vertical scaling
The most immediately clear problem with vertical scaling is cost. Costs on Heroku can quickly go from $25 per month to hundreds of dollars per month as you move to higher and higher tiers of dynos. You’ll also notice diminishing returns on those extra dollars spent.
Vertical scaling also isn’t dynamic or automatic. You have to change your dyno type yourself on the Heroku dashboard, which will restart your service, causing at least a brief downtime.
That makes it a poor option for responding to demand changes, especially if they’re common or cyclic. It’d be silly to wait for traffic to increase, upgrade your Heroku dyno to have more resources, let it restart, then downgrade when load goes back down.
Responding to demand by upgrading your dyno is made even harder by the fact that taking full advantage of increased resources like CPU cores requires application configuration changes that have to be deployed as well. Higher-tier dynos offer more CPU cores, which are most useful when your application runs multiple processes. In languages constrained by a global interpreter lock, like Ruby (the GVL) and Python (the GIL), you’ll need to tweak some environment variables to run extra processes before your app benefits from the extra CPU cores.
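To make the "extra processes" point concrete, here is a minimal sketch of how an app might decide its process count at boot. It assumes the count is supplied through a `WEB_CONCURRENCY` environment variable, a common convention that this article doesn’t name, and the `worker_process_count` helper is hypothetical. The fallback to the reported core count is also an assumption; on shared dynos the reported number may not match the CPU share you actually get.

```python
import os


def worker_process_count(processes_per_core: int = 1) -> int:
    """Decide how many app processes to start on this dyno (hypothetical helper)."""
    # Assumption: the process count is configured through WEB_CONCURRENCY.
    configured = os.environ.get("WEB_CONCURRENCY")
    if configured is not None:
        # Never go below one process, even if the variable is set to 0.
        return max(1, int(configured))

    # Fallback: one process per core the runtime reports (may be None).
    cores = os.cpu_count() or 1
    return max(1, cores * processes_per_core)
```

The point of reading the value from the environment is that moving to a bigger dyno only helps once this number is raised to match, which is one more step to coordinate with the upgrade.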
Horizontal scaling offers an alternative, or even a complement, to vertical scaling, whether you’ve already scaled vertically as far as you can or you’ve run into one of its other limitations. Let’s dig into that next!
Horizontal scaling
I know I’ve already hinted at it, but I think horizontal scaling is a lot more useful in a lot more cases. Horizontal scaling is the general process of adding more servers, as opposed to adding more resources to those servers.
On Heroku, you can change the number of dynos that your application is using with the famous dyno slider.
Once you have an appropriate amount of resources on a dyno, it often makes sense to increase the dyno count when you need to instead of continuing to add resources to a single dyno.
Benefits of horizontal scaling
Adding more dynos gives you the chance to use concurrency to your advantage. Whether you’re serving more web requests or processing background jobs with multiple workers, running more dynos lets you do more tasks at the same time.
You can generally scale out job queues with no code changes at all, since job queues are built to be distributed. Scaling horizontally also opens up more options for how you distribute work. Taking advantage of those might require some changes to your job setup, but that’s an optional optimization.
Beyond that, having more than one dyno gives you some resiliency against failure since you have more processes available. If these two things aren’t enough, have I mentioned that it’s often cheaper than scaling vertically? That’s right - dollars spent on performance can go further when they’re spent on more servers instead of bigger servers.
Part of that cost-effectiveness comes from autoscaling. Horizontal scaling is much more compatible with dynamic load than vertical scaling. You can add dynos when a service is under heavy load and take those servers away when they’re not needed.
This ensures you’re not paying for more resources than you actually need, something vertical scaling doesn’t handle nearly as well.
So, should you scale vertically or horizontally? It probably doesn’t surprise you, but this requires a nuanced answer. To quote many software developers:
It depends!
It often makes sense to scale your dynos vertically as much as you need to, but no more. Then you can scale horizontally. If you are already using one large, powerful dyno, scaling horizontally just adds more of those powerful dynos.
If you’re using a performance dyno that costs you $250 per month, adding just one more dyno makes that $500 per month. The jump from $250 to $500 per month is a big one.
Contrast this with scaling smaller (and cheaper!) dynos, and you’ll see that smaller dynos give you more precision when adding to and removing from your dyno count.
Using a larger dyno reduces the granularity of your horizontal scaling, making your efforts to add or remove servers more dramatic (and expensive). Using smaller dynos allows you to add and take away small bits of compute at a time, affecting your costs in a less dramatic way.
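The granularity argument is just arithmetic, and a rough sketch makes it easier to see. The $250 figure comes from the performance dyno example above; the $25 price for a smaller dyno is an assumption borrowed from the starting price mentioned earlier, and the function names are hypothetical. This compares the size of each cost step only, not the compute you get for it.

```python
def monthly_cost(dyno_count: int, price_per_dyno: int) -> int:
    """Total monthly cost in dollars for a number of identical dynos."""
    return dyno_count * price_per_dyno


def cost_steps(price_per_dyno: int, max_dynos: int) -> list[int]:
    """Monthly cost at each dyno count from 1 up to max_dynos."""
    return [monthly_cost(n, price_per_dyno) for n in range(1, max_dynos + 1)]


PERFORMANCE_DYNO_PRICE = 250  # from the example in the article
SMALL_DYNO_PRICE = 25         # assumption: a smaller, cheaper dyno tier

large_steps = cost_steps(PERFORMANCE_DYNO_PRICE, 3)  # [250, 500, 750]
small_steps = cost_steps(SMALL_DYNO_PRICE, 3)        # [25, 50, 75]
```

Each scaling event on the larger dyno moves the bill by $250, while the smaller dyno moves it by $25, which is what makes frequent adding and removing of dynos easier to justify.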
You can also take a different approach for each service that makes up your application. Your web service and your worker services don’t need to run the same kind or number of dynos. If your web service needs a more expensive dyno but you can just scale your worker service horizontally, do that!
Understanding the Heroku dyno options
If you haven’t already guessed from our usage throughout the article, “dynos” are what Heroku calls the containers that run code. They’re isolated environments that come with associated resources, so you select a dyno type that aligns with the resources you need. More powerful dynos cost more money (of course), so it’s in your best interest to not over-allocate with resources you don’t need.
The layout of the dynos that make up your application is called the “dyno formation”. By default, the dyno formation for an app is just a single web dyno. If you have background workers, you can run those in worker dynos. You can choose your web and worker dyno tiers independently from each other, and you can scale the dyno count independently from each other as well.
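As an illustration of web and worker dynos being sized and counted independently, the sketch below builds (but does not send) requests against the Heroku Platform API’s formation endpoint. The app name, token, and size strings are placeholders, and `build_formation_update` is a hypothetical helper; check the Platform API reference for the exact size names your account supports before using anything like this.

```python
import json
import urllib.request


def build_formation_update(app_name, process_type, quantity, size, api_token):
    """Build a request that sets dyno count and size for one process type."""
    url = f"https://api.heroku.com/apps/{app_name}/formation/{process_type}"
    body = json.dumps({"quantity": quantity, "size": size}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Accept": "application/vnd.heroku+json; version=3",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_token}",
        },
    )


# Placeholder values: one larger web dyno, three smaller worker dynos.
web = build_formation_update(
    "my-example-app", "web", 1, "standard-2X", "placeholder-token"
)
worker = build_formation_update(
    "my-example-app", "worker", 3, "standard-1X", "placeholder-token"
)
# Sending is deliberately left out: urllib.request.urlopen(web) would
# change a real app's formation.
```

The two requests differ only in process type, quantity, and size, which mirrors how each part of the formation can be tuned on its own.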
Choosing a dyno tier requires more thought than just resource needs; there are some feature differences as well! Most notably, the basic and eco dyno types don’t allow for horizontal scalability, even if you wanted to do it manually. If you want to run multiple dynos, you’ll need at least a standard tier dyno.
If you’re scaling horizontally, you probably want the option to autoscale. Unless you’re using a third-party autoscaler like Judoscale, that limits your choice of tier: Heroku’s built-in autoscaler is only available on performance, private, and shield dynos.
Autoscaling Heroku apps helps you handle traffic changes
Every application should have some sort of plan in place for scaling in order to ensure acceptable performance regardless of load. Even hobby projects can quickly pick up traction and struggle to keep up.
Autoscaling, adding and removing dynos in response to load without manual intervention, is likely a key part of that plan. Heroku has a built-in autoscaler, which automatically changes the dyno count when response time reaches defined thresholds, helping you keep up with demand without paying for more dynos when they’re not needed.
The built-in autoscaler has a couple of drawbacks, though. First, it’s only available for performance, private, and shield dynos. Autoscaling can be helpful even if you don’t need these higher-tier dynos, so this is frustrating.
Second, the Heroku autoscaler is slow. It autoscales based only on response time, which isn’t always an accurate or timely way to represent dyno capacity.
Judoscale, on the other hand, is a third-party autoscaler that integrates easily with Heroku. It’s available for all the Heroku dyno types, which means it can help you even if you’re not using performance dynos. Judoscale scales dynos based on queue time, a better metric for quickly determining that an application or queue needs more capacity.
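The threshold idea behind autoscaling can be sketched in a few lines. This is a simplified illustration, not the actual algorithm used by the Heroku autoscaler or by Judoscale; the function name, the one-dyno-at-a-time step, and the threshold values in the usage note are all assumptions. The metric could be response time or queue time, depending on the autoscaler.

```python
def next_dyno_count(
    current: int,
    metric_ms: float,
    upscale_above_ms: float,
    downscale_below_ms: float,
    min_dynos: int,
    max_dynos: int,
) -> int:
    """Pick the next dyno count from one metric reading (hypothetical sketch)."""
    if metric_ms > upscale_above_ms:
        target = current + 1  # under pressure: add a dyno
    elif metric_ms < downscale_below_ms:
        target = current - 1  # plenty of headroom: remove a dyno
    else:
        target = current      # within the acceptable band: do nothing
    # Keep the result inside the configured range.
    return max(min_dynos, min(max_dynos, target))
```

For example, with an upscale threshold of 500 ms and a downscale threshold of 100 ms, a reading of 800 ms on 2 dynos would move the count to 3, while a reading of 300 ms would leave it at 2. How quickly the chosen metric reflects real pressure is what determines how quickly a loop like this can react.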
Once you’ve selected the smallest dyno that fits your needs, be sure to consider horizontal autoscaling as an automated way to stay on top of your demand. Whether it’s the Heroku autoscaler or Judoscale, you can rest easy knowing you’re better prepared to handle changes in traffic.