Our new autoscaling service for Amazon ECS services and clusters is live! 🎉
But how does it work? It’s proprietary, of course… (prepares NDA)
Totally kidding. We’ve gotten several questions around the Judoscale/ECS integration, so we wanted to write a quick post that walks through some of the basics. Let’s start with the adapter.
Adapter and Queue Time
Running Judoscale requires installing an adapter in your application, and we offer several: Rails, Express, Django, Flask, and more. We’re excited to bring them to Amazon ECS! The adapters handle several background operations to track queue times across your systems, but tracking queue time for incoming web requests requires a bit more configuration.
Our adapters passively and transparently read a special header present on incoming web requests: X-Request-Start. Amazon’s various routing systems don’t add this header for us the way some other platforms do. So we need to add it ourselves!
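To make the idea of request queue time concrete, here is a minimal Python sketch of how a queue time could be derived from that header. It is not the adapters' actual implementation. It assumes the header carries a Unix timestamp in seconds with a `t=` prefix (for example `t=1700000000.123`), and the function name `queue_time_ms` is ours, purely for illustration.

```python
import time
from typing import Optional


def queue_time_ms(header_value: str, now: Optional[float] = None) -> Optional[float]:
    """Return the milliseconds elapsed since the X-Request-Start timestamp.

    Assumes a "t=<seconds>.<millis>" value. Returns None when the header
    cannot be parsed, so a malformed header never breaks a request.
    """
    raw = header_value.strip()
    if raw.startswith("t="):
        raw = raw[2:]
    try:
        started_at = float(raw)
    except ValueError:
        return None
    if now is None:
        now = time.time()
    # Clamp at zero so a little clock skew never yields a negative queue time.
    return max(0.0, (now - started_at) * 1000.0)


# A request stamped at the proxy and picked up by the app 250 ms later.
print(queue_time_ms("t=1700000000.000", now=1700000000.250))
```

The gap between "the proxy saw the request" and "the app started handling it" is the time the request spent waiting, and that waiting time is the number autoscaling reacts to.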
While there are several ways to accomplish the header addition, and many of them do work with our system, the method we recommend is the Sidecar Container pattern within the Task Definition of the web-server process. If your particular application manages incoming requests and routing in another style, feel free to configure the header there and we can help you determine if it’s working properly.
In a quick walkthrough of the Sidecar pattern, let’s start with a simple Task Definition that just runs a plain Rails Server process. It would look something like this:
In this case, the Task Definition has a single container (the Rails Server process) and that container fields incoming requests on port 80 directly. The Sidecar pattern essentially injects another container (the ‘sidecar’) into the Task Definition which acts as a proxy / intermediary before the request makes it to the application. That looks more like this:
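As a rough sketch of what the sidecar version of a Task Definition might contain, the Python snippet below builds one as a dictionary and prints it as JSON. The family, container names, and image references are hypothetical placeholders, and a real definition needs more fields (CPU, memory, roles, and so on). It also assumes the `awsvpc` network mode, where containers in the same task can reach each other over localhost.

```python
import json

# Hypothetical names and images; substitute your own.
task_definition = {
    "family": "web",
    "networkMode": "awsvpc",  # containers in one task share localhost
    "containerDefinitions": [
        {
            # The sidecar: receives traffic on port 80 and stamps the header.
            "name": "nginx-sidecar",
            "image": "example/nginx-request-start:latest",
            "essential": True,
            "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
        },
        {
            # The app: listens on 3000 and is only reached through the sidecar,
            # so it declares no port mappings of its own.
            "name": "rails",
            "image": "example/rails-app:latest",
            "essential": True,
        },
    ],
}

print(json.dumps(task_definition, indent=2))
```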
The NGINX config at play here is extremely minimal:
server {
  listen 80;
  location / {
    proxy_set_header X-Request-Start "t=${msec}";
    proxy_pass http://localhost:3000;
  }
}
And, if this specific combination (listen on 80, pass to 3000) matches your needs, you’re welcome to use the publicly-available container we prepared here. If you prefer to build your own, you need only two lines of a Dockerfile to build the fully functional container (where nginx.conf is the above config):
FROM nginx:latest
COPY nginx.conf /etc/nginx/conf.d/default.conf
Once your sidecar is running, the Judoscale adapter should start reading the request header and observing request queue times automatically.
AWS Permissions and ENV
The second piece of the how-it-works puzzle revolves around AWS platform-level setup. This consists of three steps that deal less with your application code and more with your infrastructure itself. All three steps are guided and automated by Judoscale, though you can do them by hand if you prefer a custom setup.
ECS Read Permissions
When you create a new Amazon ECS team in Judoscale, you’ll be greeted with this screen:
This is our default automation step that offers to run our pre-fab CloudFormation template for you. That template is linked directly in the view and you can see it here.
The tl;dr is that this script simply creates ‘allow’ permission rules, specifically for Judoscale, for ecs:Describe*, ecs:List*, iam:ListAccountAliases, and account:ListRegions. These permissions allow us to populate the clusters list in the screen that follows:
And, upon clicking “Link” for a cluster, the same permissions allow us to list out and prepare the services within that cluster:
It’d be tough to autoscale an app we can’t read!
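For readers who think in IAM policy documents, here is a hedged Python sketch of what a read-only policy covering those four actions could look like. It is an illustration, not a copy of our CloudFormation template: the `Resource` scope of `"*"` and the helper name `read_only_policy` are assumptions.

```python
import json

# The four read-only actions named above.
READ_ACTIONS = [
    "ecs:Describe*",
    "ecs:List*",
    "iam:ListAccountAliases",
    "account:ListRegions",
]


def read_only_policy(actions=READ_ACTIONS):
    """Build an IAM policy document that allows only the given actions."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": list(actions),
                "Resource": "*",  # assumption: the real template may scope this
            }
        ],
    }


print(json.dumps(read_only_policy(), indent=2))
```

Notice that nothing in this list can change anything in your account. At this stage Judoscale can look, but not touch.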
Judoscale ENV
Once you select a Service to link and get your framework-specific Judoscale adapter installed, you’ll be given a unique ENV value. That’ll look something like this in the UI:
This ENV value needs to be present in any task/container running a Judoscale adapter, but Judoscale doesn’t dictate the means by which you implement that requirement. We’ve worked with teams that hard-code the value into their Task Definitions, teams that add the key-value pair into their Terraform setups, teams that use a Parameter/Secrets Manager structure, and plenty of others. We leave that up to you, but we’re here to help if you’re not sure which path to take! (Did you know we have open office hours?)
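If you manage Task Definitions programmatically, one option is to add the variable to each relevant container's `environment` list. The Python sketch below shows that under a few assumptions: the variable name `JUDOSCALE_URL` and the placeholder value are illustrative (use whatever the UI gives you), and `with_env` is a hypothetical helper, not part of any Judoscale or AWS library.

```python
import copy


def with_env(task_definition, name, value, container_names=None):
    """Return a copy of the task definition with name=value set on containers.

    When container_names is given, only those containers are changed.
    An existing entry with the same name is replaced rather than duplicated.
    """
    updated = copy.deepcopy(task_definition)
    for container in updated["containerDefinitions"]:
        if container_names is not None and container["name"] not in container_names:
            continue
        env = [e for e in container.get("environment", []) if e["name"] != name]
        env.append({"name": name, "value": value})
        container["environment"] = env
    return updated


# Hypothetical task definition with an app container and a sidecar.
task_def = {"containerDefinitions": [{"name": "rails"}, {"name": "nginx-sidecar"}]}

# Only the container running the adapter needs the value.
patched = with_env(
    task_def,
    "JUDOSCALE_URL",  # assumed variable name
    "<value from the Judoscale UI>",
    container_names={"rails"},
)
print(patched["containerDefinitions"][0]["environment"])
```

For a value you would rather not keep in plain text, the same idea applies with a Parameter Store or Secrets Manager reference in place of the literal.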
Once you have the Judoscale adapter running and the environment variable in place, both the “Finished and Deployed” button and the page background will be your feedback confirmation. Clicking “Finished and Deployed” will either give you an error message of “We haven’t quite seen data come in yet…” or it will confirm that we are seeing data come in from your Service. Similarly, the charts in the background will begin moving and changing in real-time to reflect your Service’s traffic and queue times once your setup is working properly.
Write Permissions
Alright, so at this point we’ve got our Secret Sauce (™️) mostly formed — we have the AWS read-permissions we need to know about your clusters/services, we have the Judoscale adapter running in your application’s runtime, and we’ve confirmed the adapter is successfully reporting queue times… now we just need to scale! Automatically, even 😉
Changing your service’s scale count requires the last piece of first-time setup: AWS write permissions for your Service(s).
The first time you click the “Autoscaling On” switch for a given service, you’ll be presented with the final modal, another automation that adds write permissions:
Once again that script is linked in the view and you can see the raw source here.
But the tl;dr on this one is that we update the same role we created in the original “read permissions” script and add an ‘allow’ for ecs:UpdateService only. That single permission is what authorizes our platform to change your service’s scale count, and we do that by simply setting the service’s desired task count (the desiredCount field in the ECS API). That’s it!
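To show how small that write operation is, here is a hedged Python sketch of changing a service's desired count through the ECS UpdateService API. It is not our production code. With boto3 you would pass `boto3.client("ecs")` as the client; the sketch uses a tiny fake client so it runs offline, and the cluster and service names are placeholders.

```python
def set_desired_count(ecs_client, cluster: str, service: str, count: int) -> int:
    """Set an ECS service's desired task count via UpdateService.

    ecs_client is expected to behave like boto3.client("ecs").
    Returns the desired count reported back in the response.
    """
    if count < 0:
        raise ValueError("count must be non-negative")
    response = ecs_client.update_service(
        cluster=cluster, service=service, desiredCount=count
    )
    return response["service"]["desiredCount"]


class FakeECSClient:
    """Offline stand-in for boto3.client("ecs"), used only to run the sketch."""

    def __init__(self):
        self.calls = []

    def update_service(self, **kwargs):
        self.calls.append(kwargs)
        return {"service": {"desiredCount": kwargs["desiredCount"]}}


client = FakeECSClient()
print(set_desired_count(client, "my-cluster", "web", 4))  # prints 4
```

One API call with one changed field is all that scaling a service takes, which is why ecs:UpdateService is the only write permission needed.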
The Fully Working Machine
We’ve got all the permissions and data sorted out, and autoscaling is running smoothly at this point! So let’s get back to the primary question: how does it work?
Let’s take an illustrative approach… let’s say you’ve got a small ECS cluster — just a web service and a single worker service:
But let’s not be too simple. We have multiple task instances running in each of those services, so our model might be better represented like so:
The first layer we’ll add here is the Judoscale adapter: its job is to transparently report queue time data from each individual instance in any given service to Judoscale:
As the data reaches Judoscale, it’s continuously scanned and monitored at a per-service level to ensure that the service is staying within its set scaling parameters. That data is also exposed in the real-time Judoscale UI so you can observe and understand your system health directly:
And as soon as one of your services begins reporting queue times that breach your chosen autoscaling settings, Judoscale automatically adjusts your scale count on AWS to compensate, and quickly. We generally prompt AWS to upscale within 10 to 20 seconds of a detected slowdown! Let’s illustrate that like this:
And that’s essentially it. The fully operational autoscaling machine is simple at its core: receive lots of data, continuously scan it all in real time, find breaches and outliers, and adjust the scale for services that need more (or fewer) instances depending on their reported queue times (the only truly reliable metric to scale on). Smooth and simple.
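If it helps to see that loop as code, here is a deliberately simplified Python sketch of a threshold-based scaling decision. To be clear, this is not Judoscale's actual algorithm. The settings names, the use of a plain average, and the one-task-at-a-time step are all assumptions made for illustration.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class ScalingSettings:
    """Hypothetical per-service autoscaling settings."""
    upscale_above_ms: float
    downscale_below_ms: float
    min_tasks: int
    max_tasks: int


def next_task_count(
    current: int, queue_times_ms: Sequence[float], settings: ScalingSettings
) -> int:
    """Pick the next desired task count from recent queue time reports."""
    if not queue_times_ms:
        return current  # no data reported, so leave the scale alone
    average = mean(queue_times_ms)
    if average > settings.upscale_above_ms:
        target = current + 1
    elif average < settings.downscale_below_ms:
        target = current - 1
    else:
        target = current
    # Never leave the range the service owner chose.
    return max(settings.min_tasks, min(settings.max_tasks, target))


settings = ScalingSettings(
    upscale_above_ms=100, downscale_below_ms=10, min_tasks=1, max_tasks=5
)
print(next_task_count(2, [150, 200], settings))  # prints 3
print(next_task_count(2, [1, 2], settings))      # prints 1
```

The returned number is what would then be written to the service's desired count on AWS.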
Thanks for tuning in to this little how-it-works! We’re really excited to GA this integration and we’re looking forward to supporting the teams that have been asking for it. Amazon ECS on Judoscale is a new frontier for fast, extremely-responsive infrastructure, and it’s ready right now. Scale on, friends!