Java
Step-by-Step Guide to Installing the Judoscale Package for Spring Boot
Guided Install
When you launch Judoscale for the first time, you’ll be prompted to install a package that’s specific to your app’s language and framework. In the package installation modal, select “Java” for your language and “Spring Boot” for your web framework.
Based on your selections, you’ll see instructions for installing the correct package and integrating it with your application.
👀 Note
We now have two Spring Boot libraries: judoscale-spring-boot-starter for Spring Boot 3.x and judoscale-spring-boot-2-starter for Spring Boot 2.x (legacy support). Choose the one that matches your Spring Boot version. Check the judoscale-java repo README for the latest version of the library and detailed installation notes.
Verification
Once you’ve committed and deployed these changes, click the button at the bottom of the modal to verify your package setup.
Once verified, the modal will close, and Judoscale will begin collecting throughput and request queue time metrics.
Note that autoscaling is still turned off. You’ll want to review your configuration before enabling autoscaling.
How It Works
Once configured, the library works automatically. It will:
- Measure request queue time — Captures the time requests spend waiting in your platform’s router queue before reaching your application.
- Measure web utilization — Captures how often your web servers are busy handling requests.
- Report metrics — Sends collected metrics to the Judoscale API every 10 seconds.
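To make the first measurement concrete, here is a minimal sketch of how queue time can be derived from a timestamp that the platform's router attaches to each request. It assumes a Heroku-style X-Request-Start header carrying milliseconds since the Unix epoch; other platforms may use a different header or unit. The class and method names are hypothetical, and this is not the library's actual implementation.

```java
import java.util.OptionalLong;

/** Hypothetical helper for illustration only; not part of the Judoscale library. */
final class QueueTimeSketch {

    private QueueTimeSketch() {}

    /**
     * Computes how long a request waited between the router and the application.
     *
     * @param requestStartHeader value of the (assumed) X-Request-Start header, in epoch milliseconds
     * @param nowMillis          epoch milliseconds at which the application picked up the request
     * @return the queue time in milliseconds, or empty if the header is missing or malformed
     */
    static OptionalLong queueTimeMillis(String requestStartHeader, long nowMillis) {
        if (requestStartHeader == null || requestStartHeader.trim().isEmpty()) {
            return OptionalLong.empty();
        }
        try {
            long startedAt = Long.parseLong(requestStartHeader.trim());
            // Clock skew between router and app could yield a negative value; clamp to zero.
            return OptionalLong.of(Math.max(0L, nowMillis - startedAt));
        } catch (NumberFormatException e) {
            return OptionalLong.empty();
        }
    }
}
```

In a real application the second argument would typically be System.currentTimeMillis(), read as early as possible in request handling.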
👀 Note
The library is automatically disabled in development or any environment where JUDOSCALE_URL is not set. It’s safe to include in your project without affecting local development.
Queue Time and Tomcat Concurrency
When autoscaling a Spring Boot application, it’s important to understand how Tomcat’s thread concurrency affects your queue time metrics.
Tomcat’s default thread pool size is 200. Queue time measures how long a request waits between the load balancer and Tomcat picking it up. With such a high concurrency threshold, Tomcat can handle many simultaneous requests before any start queuing—which means you’ll only see queue time increase once all 200 threads are busy.
For most applications, it takes a lot of traffic to keep 200 threads busy at once. If your app rarely saturates 200 threads, queue time may not be a reliable indicator of when to scale.
When Queue Time Works Well
If you’ve configured Tomcat to use fewer threads (e.g., via server.tomcat.threads.max), queue time becomes a more sensitive and reliable metric. With a lower concurrency threshold, requests will start queuing sooner, giving Judoscale earlier signals to scale up.
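As an example, a lower thread limit could be set in application.properties as shown below. The value 20 is purely illustrative and not a recommendation; the right number depends on your app's workload and available memory.

```properties
# Cap Tomcat's request-handling threads (the default is 200).
# 20 is an illustrative value only; tune it for your workload.
server.tomcat.threads.max=20
```

Spring Boot versions before 2.3 used the older property name server.tomcat.max-threads for the same setting.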
Our Recommendation: Use Both Metrics
With the default concurrency of 200, we highly recommend enabling the utilization metric alongside queue time. Utilization tracks how often your web server is busy handling requests—if any threads are active, the server is considered busy. This provides a proactive signal before requests start queuing.
Using both metrics together gives you the best of both worlds:
- Utilization helps scale up proactively as your server stays busy
- Queue time acts as a safety net, ensuring you scale when requests actually start waiting
Learn more about the utilization metric and how to configure it in our Understanding Utilization guide.
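The busy-or-idle idea behind utilization can be sketched in a few lines. The example below is an assumption-laden illustration, not the library's code: it samples an in-flight request counter at intervals and reports the percentage of samples in which at least one request was active. All class and method names are hypothetical.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical busy/idle utilization tracker for illustration; not the Judoscale implementation. */
final class UtilizationSketch {

    private final AtomicInteger activeRequests = new AtomicInteger();
    private long busySamples;
    private long totalSamples;

    /** Call when a request begins processing. */
    void requestStarted() {
        activeRequests.incrementAndGet();
    }

    /** Call when a request finishes processing. */
    void requestFinished() {
        activeRequests.decrementAndGet();
    }

    /** Records one sample; the server counts as busy if any request is in flight. */
    synchronized void sample() {
        totalSamples++;
        if (activeRequests.get() > 0) {
            busySamples++;
        }
    }

    /** Returns the percentage of busy samples since the last call, then resets the counters. */
    synchronized int utilizationPercentAndReset() {
        int percent = totalSamples == 0
                ? 0
                : (int) Math.round(100.0 * busySamples / totalSamples);
        busySamples = 0;
        totalSamples = 0;
        return percent;
    }
}
```

Because a single active request marks the whole sample as busy, this signal rises well before all threads are saturated, which is why it complements queue time.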
Troubleshooting
If metrics aren’t being reported, check the following:
- Ensure the JUDOSCALE_URL environment variable is set.
- Ensure your deploy has completed with the library installed.
- Enable debug logging to see detailed information about metric collection.
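For the first check, one option is to confirm from inside the running app that the variable is visible to the JVM. The helper below is a hypothetical sketch (its name is not part of the Judoscale library); it takes the environment as a map so it can be exercised without changing real environment variables.

```java
import java.util.Map;

/** Hypothetical troubleshooting helper; not part of the Judoscale library. */
final class JudoscaleEnvCheck {

    private JudoscaleEnvCheck() {}

    /** Returns true if JUDOSCALE_URL is present and not blank in the given environment. */
    static boolean isConfigured(Map<String, String> env) {
        String url = env.get("JUDOSCALE_URL");
        return url != null && !url.trim().isEmpty();
    }
}
```

Calling JudoscaleEnvCheck.isConfigured(System.getenv()) at startup and logging the result can quickly rule out a missing variable on the deployed environment.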
For additional help, see our troubleshooting guide or contact us.