Add an NGINX sidecar to expose request queue time

While the package installation alone will allow background job (worker) services to connect to Judoscale, web services require an additional step. In order to correctly determine request queue time for web services, Judoscale reads a unique header on each incoming request: X-Request-Start. Some hosting platforms automatically add this header to every incoming request. Amazon ECS does not. So we’ll need to add it ourselves!

Our recommendation is to follow the sidecar pattern, adding a second container to your web service’s task definition. This container is essentially a very minimal web server — NGINX, for example. Its job is simply to add the header and proxy the request to your application container as if it had received it in the first place. We covered this more extensively in our blog post, “How Our Amazon ECS Autoscaling Works,” but the sidecar pattern is a stable and reliable pattern for header injection.
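To make the shape of that task definition concrete, here’s a trimmed-down sketch of the relevant fields. The family, container names, and image URIs are placeholders for illustration, not real values; the key idea is two containers in one task, with the sidecar on port 80 and the app on port 3000:

```json
{
  "family": "web",
  "networkMode": "awsvpc",
  "containerDefinitions": [
    {
      "name": "nginx-sidecar",
      "image": "<your-registry>/nginx-request-start:latest",
      "portMappings": [{ "containerPort": 80 }],
      "essential": true
    },
    {
      "name": "app",
      "image": "<your-registry>/app:latest",
      "portMappings": [{ "containerPort": 3000 }],
      "essential": true
    }
  ]
}
```

With this layout, your service’s load balancer target group should point at the sidecar’s port 80 rather than the app container, so every request passes through NGINX before reaching your application.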

We’ve also made our own sidecar container available from the Amazon Elastic Container Registry if you’d prefer a drop-in solution. Note that this particular container expects your application to be listening on port 3000. That container can be found here, but feel free to build your own in your own container pipeline — the Dockerfile is simply:

FROM nginx:latest

COPY nginx.conf /etc/nginx/conf.d/default.conf

and the nginx.conf is simply:

server {
    listen 80;

    location / {
        proxy_set_header X-Request-Start "t=${msec}";

        proxy_pass http://localhost:3000;
    }
}

For our particular application, that’s all that’s required. Since we’re running on Fargate, all containers within a single task share a network namespace, so they can reach one another over localhost.
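For illustration only, here’s roughly the calculation that header enables: the elapsed time between NGINX stamping the request (nginx’s `$msec` is epoch seconds with millisecond precision) and your app picking it up. This is our own sketch of the idea, not Judoscale’s actual implementation:

```python
import time

def queue_time_ms(header_value, now=None):
    """Derive request queue time from an X-Request-Start header.

    nginx's $msec renders as epoch seconds with millisecond
    precision, so header_value looks like "t=1465510280.000".
    """
    start = float(header_value.split("=", 1)[1])
    now = time.time() if now is None else now
    # Clamp at zero in case of minor clock skew between containers.
    return max(0.0, (now - start) * 1000.0)

# A request stamped at .000 and picked up at .250 waited 250 ms:
print(queue_time_ms("t=1465510280.000", now=1465510280.250))  # 250.0
```

The millisecond precision of `$msec` is what makes this calculation meaningful; a whole-second timestamp would round most queue times down to zero.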

🚨 Warning

We offer our pre-fab container (and code sample above) only as an example; your specific application configuration may vary. While the sidecar pattern should work to inject headers into any application’s web service, you may have differences or preferences in port numbers, inter-container networking, and/or proxy web-server choices. As long as the X-Request-Start header is added to the request before it reaches your application, Judoscale will be able to properly determine your request queue time.

Frequently Asked Questions

Can I add the header somewhere other than the sidecar?

You may be tempted to push header injection deeper into your stack—for example, in Puma via Rack middleware. Unfortunately, that approach doesn’t work because the queue time we need occurs before the request reaches your application server. The sidecar (or an upstream load balancer) must add the header before the app receives the request. In theory, if your ECS task is under such extreme load that NGINX struggles to respond, queue time reporting could lag, but we haven’t observed that in practice.

Will Cloudflare or another CDN work instead of the sidecar?

Yes, you can add X-Request-Start at the CDN layer, and several teams have done that successfully. Keep in mind that Judoscale will then measure the time from the CDN edge through your entire network stack to the app container. That extra distance means higher baseline queue times and potentially more variance, so be prepared to adjust your thresholds accordingly.

Can I rely on AWS X-Amzn-Trace-Id instead?

The AWS trace header does include a timestamp, but it’s rounded to whole seconds. That level of precision isn’t sufficient for Judoscale queue time calculations, so we don’t recommend relying on it as a drop-in replacement for X-Request-Start.
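To see why, you can decode the timestamp yourself. The second dash-separated field of the trace header is the request start time as hex-encoded epoch seconds, with no sub-second component (the example value below is the one used in the AWS X-Ray documentation):

```python
# Example trace header value from the AWS X-Ray docs:
trace = "Root=1-5759e988-bd862e3fe1be46a994272793"

# The second field is the start time in hex-encoded epoch *seconds*,
# so any queue time under a second is invisible at this precision.
epoch_seconds = int(trace.split("-")[1], 16)
print(epoch_seconds)  # 1465510280
```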

Verify Incoming Metrics

Once you’ve installed the package, set up a sidecar (for web services), and deployed your application, Judoscale will begin showing your queue metrics in the Scaling page charts.

Screenshot: scaling charts in Judoscale

If your web service isn’t receiving traffic, or if your worker service has no jobs waiting in queue, you won’t see any activity in the charts. Let it collect metrics while your app is under load to see queue time information.