Understanding Request and Job Queue Time for Optimal Deployment

Let’s take a little time to learn more about request (and job) queue time — the secret to an optimal deployment on any platform.

What is Request Queue Time?

Put simply, request queue time is the time between a request being received and your app beginning to process it.

Queue time includes (1) network time between the Heroku router and the application dyno and (2) time spent waiting within the dyno for an available app process or thread to handle the request. The latter is what we care about: if requests are waiting for more than a few milliseconds, there’s a capacity issue.
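As a rough illustration of the measurement described above, here is a minimal Python sketch. It assumes the router stamps each request with an X-Request-Start header holding a Unix timestamp in milliseconds (as the Heroku router does). The function name and the clamping behavior are illustrative choices, not Judoscale’s implementation.

```python
import time
from typing import Optional


def request_queue_time_ms(x_request_start: str, now_ms: Optional[float] = None) -> float:
    """Return the milliseconds elapsed since the router stamped the request.

    x_request_start: header value, assumed to be Unix time in milliseconds.
    now_ms: current time in milliseconds (defaults to the system clock).
    """
    if now_ms is None:
        now_ms = time.time() * 1000.0
    started_ms = float(x_request_start)
    # Clock skew between the router and the dyno can produce a negative
    # difference, so clamp the result at zero.
    return max(0.0, now_ms - started_ms)
```

You would call this as early as possible in the request cycle (for example, in the outermost middleware) so that the app’s own processing time is not counted as queue time.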

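This is why Judoscale scales based only on queue time. Web requests can be slow for lots of reasons, such as a slow database query or a slow external API, and adding dynos fixes none of those. Queue time, on the other hand, grows when requests are stuck waiting for a free process, so it reflects true capacity.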
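To make the idea of scaling on queue time concrete, here is a simplified Python sketch of a threshold-based decision. It is not Judoscale’s actual algorithm: the function name, the thresholds (100 ms and 10 ms), the one-dyno step, and the dyno limits are all hypothetical values chosen for illustration.

```python
def desired_dyno_count(
    current: int,
    queue_time_ms: float,
    upscale_above_ms: float = 100.0,   # hypothetical threshold
    downscale_below_ms: float = 10.0,  # hypothetical threshold
    min_dynos: int = 1,
    max_dynos: int = 10,
) -> int:
    """Return the dyno count suggested by the observed queue time."""
    if queue_time_ms > upscale_above_ms:
        target = current + 1  # requests are waiting: add capacity
    elif queue_time_ms < downscale_below_ms:
        target = current - 1  # almost no waiting: capacity to spare
    else:
        target = current      # within the acceptable band: hold steady
    # Never go outside the configured range.
    return max(min_dynos, min(max_dynos, target))
```

Note that total response time never appears in the decision. A slow endpoint with no queueing leaves the dyno count unchanged.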

👀 Note

Check out our complete guide to request queue time for a full breakdown.

What is Job Queue Time?

Job queue time is how long jobs wait in a queue before being processed. Judoscale measures job queue time per queue, and you select which queues you want to use for autoscaling.

You can think of job queue time like an SLA (service level agreement) for your jobs. Some jobs might need to be processed within 30 seconds, while others are fine to wait for a few minutes. Using job queue time for autoscaling aligns your scaling behavior with your business needs.
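The SLA framing above can be sketched in a few lines of Python. This is an illustrative model only: it assumes queue time is measured as the age of the oldest waiting job, and the queue names, targets, and function names are hypothetical rather than taken from Judoscale or any job framework.

```python
from typing import Dict, List


def queue_time_seconds(enqueued_at: List[float], now: float) -> float:
    """Age in seconds of the oldest waiting job, or 0.0 for an empty queue."""
    if not enqueued_at:
        return 0.0
    return max(0.0, now - min(enqueued_at))


def queues_over_target(
    waiting: Dict[str, List[float]],  # queue name -> enqueue timestamps (seconds)
    targets: Dict[str, float],        # queue name -> acceptable wait (seconds)
    now: float,
) -> List[str]:
    """Return the names of queues whose queue time exceeds their target."""
    return sorted(
        name
        for name, target in targets.items()
        if queue_time_seconds(waiting.get(name, []), now) > target
    )
```

For example, with a 30-second target on an "urgent" queue and a 300-second target on a "reports" queue, a job that has waited 45 seconds in "urgent" would flag that queue for more workers, while a job that has waited 100 seconds in "reports" would not.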

How Is Judoscale Different From Heroku’s Autoscaler?

Heroku offers a native autoscaling solution that’s worth a try if you run performance dynos and you only need to autoscale web dynos. Here’s what makes Judoscale different:

Web autoscaling based on request queue time instead of total response time. This means more reliable autoscaling.
Worker autoscaling for many background job frameworks.
Works great on standard and performance dynos.
Personalized customer support from developers like you.