Let’s take a little time to learn more about job queue depth.
What is Job Queue Depth?
When talking about jobs, we typically use two key metrics to measure our queues:
Queue depth: how many jobs are in a queue waiting to be processed; also known as queue size
Queue latency: how long any given job waits in the queue before it’s processed; also called queue time
These metrics help us understand how healthy the queues are, how quickly they’re being worked through, and when they’ve started to fall behind.
Queue depth is usually easier to visualize, but it isn’t necessarily the most important metric for autoscaling, because it can be misleading. Imagine two queues, both single-threaded:
Queue A has 10 jobs enqueued. Each job takes one second to run.
Queue B has 10,000 jobs enqueued. Each job takes one millisecond to run.
Queue B might appear to be “backed up” because it has a high queue depth (10,000 jobs), but in reality the two queues are equally healthy: 10 jobs at one second each and 10,000 jobs at one millisecond each both add up to 10 seconds of work, so both queues will clear their backlog in 10 seconds. That’s why queue latency is usually considered a better metric for autoscaling.
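To make the arithmetic concrete, here’s a minimal Python sketch of the two queues. The `QueueSnapshot` class and its field names are hypothetical, made up for illustration rather than taken from any job backend, and the sketch assumes every job takes the same fixed time and no new jobs arrive while the backlog drains.

```python
from dataclasses import dataclass


@dataclass
class QueueSnapshot:
    """Hypothetical point-in-time view of a queue (illustration only)."""
    name: str
    depth: int             # jobs waiting to be processed (queue depth)
    job_duration_ms: int   # assumed constant run time per job
    workers: int = 1       # single-threaded, as in the example

    def drain_time_seconds(self) -> float:
        # Time to clear the current backlog, assuming no new jobs arrive.
        return self.depth * self.job_duration_ms / 1000 / self.workers


queue_a = QueueSnapshot("A", depth=10, job_duration_ms=1000)
queue_b = QueueSnapshot("B", depth=10_000, job_duration_ms=1)

for q in (queue_a, queue_b):
    print(f"Queue {q.name}: depth={q.depth}, drain time={q.drain_time_seconds():.1f}s")
# Queue A: depth=10, drain time=10.0s
# Queue B: depth=10000, drain time=10.0s
```

The depths differ by a factor of 1,000, yet the drain time is identical. Under these assumptions, drain time is roughly how long the last job in line will wait, which is much closer to what queue latency measures than the raw depth is.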
When to Use Queue Depth?
It’s no secret that we’re big fans of autoscaling on request and job queue time, but certain job backends may not expose queue time, or your specific use case or business need may be better served by autoscaling on queue depth.
In those cases, it’s worth giving queue depth autoscaling a try. And autoscaling with queue depth is certainly better than no autoscaling at all.