In my first blog on Top Tomcat Performance Problems I focused on Database Access, Micro-Services Access and Bad Frameworks that impact your application performance and scalability running in your Java App Server. My second covered Bad Coding, Inefficient Logging and Exceptions. To conclude this blog series I focus on Exceptions in general, inefficient use of Pools and Queues, Multi-Threading issues and Memory Leaks.
As a reminder – here is my complete Top 10 list which is also applicable to other App Servers – so keep reading if your app runs on Jetty, JBoss, WebSphere, WebLogic, Glassfish, …:
- Database Access : Loading too much data inefficiently
- Micro-Service Access : Inefficient access and badly designed Service APIs
- Bad Frameworks : Bottlenecks under load or misconfiguration
- Bad Coding : CPU, Sync and Wait Hotspots
- Inefficient Logging : Even too much for Splunk & ELK
- Invisible Exceptions : Frameworks gone wild!
- Exceptions : Overhead through Stack Trace generation
- Pools & Queues : Bottlenecks through wrong Sizing
- Multi-Threading : Locks, Syncs & Wait Issues
- Memory : Leaks and Garbage Collection Impact
Once again I want to offer a big “thank you” to all our Dynatrace Personal License users for sharing their data with me through my Share Your PurePath program. Without you I wouldn’t have so much information to share with everyone. Also, it is great to see that some folks have already accepted the challenge announced in my first blog post on this topic: proving me wrong or sending me new problem patterns. The winner gets a speaking session with me at a conference or user group meeting (and, of course, don’t forget, eternal fame!).
As I mentioned in parts 1 and 2: If your app suffers from any of these problems I can guarantee you that not even Docker, EC2 or Azure will help you by throwing more compute power at your problems :-)
Once again, I will be using Dynatrace as it is my tool of choice but my approaches should also work with other APM or Performance Diagnostics Tools! If you want to watch a video rather than reading this blog series check out my Tomcat Performance Analysis YouTube Tutorial instead.
Tomcat Performance Problem #8: Pools & Queues: Bottlenecks from incorrect sizing
Your application relies on many pools and queues: the incoming worker thread pools, the outgoing connection pools for database or web service calls, and the background thread pools used for asynchronous activity. It is important to keep an eye on all these pools, their maximum size and their utilization during peak load. If you expect 100 concurrent requests on your front-end JVM, and every request potentially makes five calls to the database or five external web service calls, you need to make sure that you have:
- Enough JDBC connections to allow parallel access (assuming each request only needs one connection)
- Sufficient worker threads and HTTP connections to execute these external calls
The following screenshot shows a Transaction Flow of a Search Transaction for a Job Search Website. Every search that comes in on Tomcat first queries the database for the jobs that match the search criteria. It executes a total of 288 (!) SQL statements on five different JDBC connections. For every single search result the code then spawns a separate background thread (38 in total!) that calls external web services to retrieve more details for that job result. This alone binds 38 background worker threads and 38 outgoing HTTP connections:
You have to analyze how many threads, database connections and HTTP connections you consume per request. This will tell you whether the app can scale or not.
Taking the example from above, we can easily calculate what would be needed to sustain, for example, 1, 5 or 10 parallel search queries that return an average of 10 or 20 search results:
A simple Excel table allows you to figure out how many threads and connections you need to sustain a certain amount of load.
The previous example is obviously not something you want to deploy in a large-scale environment. Spawning all these parallel threads to execute external web service calls will become a huge bottleneck. Tomcat’s default maximum number of worker threads is 200 (the connector’s maxThreads setting). Taking that as your thread budget, and counting 21 threads per search (one request thread plus 20 background threads), you can sustain only 9 parallel searches that return an average of 20 search results. After that, Tomcat has to queue incoming requests.
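The sizing arithmetic is simple enough to keep as a few lines of code instead of a spreadsheet. The following is a minimal sketch, not the table from the screenshot: the class and method names are made up for illustration, and it assumes one request thread per search, one background thread and one outgoing HTTP connection per search result, and five JDBC connections per search as in the example above. The thread limit is a parameter, so plug in whatever your connector is actually configured with.

```java
/** Back-of-the-envelope pool sizing for the search example (all names are illustrative). */
final class PoolBudget {

    /** Threads one search occupies: 1 request thread + 1 background thread per result. */
    static int threadsPerSearch(int resultsPerSearch) {
        return 1 + resultsPerSearch;
    }

    /** Threads needed to run the given number of searches in parallel. */
    static int threadsNeeded(int parallelSearches, int resultsPerSearch) {
        return parallelSearches * threadsPerSearch(resultsPerSearch);
    }

    /** Outgoing HTTP connections needed: one per background thread. */
    static int httpConnectionsNeeded(int parallelSearches, int resultsPerSearch) {
        return parallelSearches * resultsPerSearch;
    }

    /** JDBC connections needed if every search holds the given number of connections. */
    static int jdbcConnectionsNeeded(int parallelSearches, int connectionsPerSearch) {
        return parallelSearches * connectionsPerSearch;
    }

    /** How many searches fit into a fixed thread budget before requests start to queue. */
    static int maxParallelSearches(int threadLimit, int resultsPerSearch) {
        return threadLimit / threadsPerSearch(resultsPerSearch);
    }

    public static void main(String[] args) {
        int[] loads = {1, 5, 10};
        int[] resultCounts = {10, 20};
        for (int results : resultCounts) {
            for (int searches : loads) {
                System.out.printf("%2d searches x %2d results: %3d threads, %3d HTTP, %2d JDBC%n",
                        searches, results,
                        threadsNeeded(searches, results),
                        httpConnectionsNeeded(searches, results),
                        jdbcConnectionsNeeded(searches, 5));
            }
        }
    }
}
```

With a budget of 250 threads and 20 results per search, maxParallelSearches returns 11; with 200 threads it returns 9.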
Looking at individual transactions like the example above is great for identifying flaws in an implementation. To better understand what is going on in a system under load, I recommend looking at the Connection and Thread Pool metrics exposed via JMX. Here is an example of monitoring the DB Connection Pool usage for 4 JVMs in a cluster. It is easy to see that some of these pools are exhausted at 20 used connections:
Tomcat and other App Servers expose Connection Pool Size and Usage via JMX. Make sure to monitor it for every JVM in the cluster. Identifying exhausted pools becomes easy!
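If you do not have an APM tool at hand, you can read the same kind of numbers straight from the MBean server. The sketch below is an assumption-laden illustration rather than a reference implementation: it looks for connector thread pools under Tomcat’s default JMX domain "Catalina" and reads the attributes currentThreadsBusy and maxThreads, so check the names with JConsole against your Tomcat version. Connection pool MBeans can be read the same way, but their attribute names depend on the pool implementation you use, which is why the sketch sticks to thread pools. PoolMonitor is a made-up class name.

```java
import java.lang.management.ManagementFactory;
import java.util.Map;
import java.util.TreeMap;
import javax.management.JMException;
import javax.management.MBeanServer;
import javax.management.ObjectName;

final class PoolMonitor {

    /** Pool utilization in percent; a pool without a positive limit reports 0. */
    static int utilizationPercent(int busy, int max) {
        return max <= 0 ? 0 : (int) Math.round(100.0 * busy / max);
    }

    /** Busy vs. max threads for every connector thread pool registered in this JVM. */
    static Map<String, Integer> threadPoolUtilization(MBeanServer server) {
        ObjectName pattern;
        try {
            // Assumption: Tomcat's default JMX domain; embedded setups may use another one.
            pattern = new ObjectName("Catalina:type=ThreadPool,name=*");
        } catch (JMException e) {
            throw new IllegalStateException(e);
        }
        Map<String, Integer> result = new TreeMap<>();
        for (ObjectName pool : server.queryNames(pattern, null)) {
            try {
                int busy = ((Number) server.getAttribute(pool, "currentThreadsBusy")).intValue();
                int max = ((Number) server.getAttribute(pool, "maxThreads")).intValue();
                result.put(pool.getKeyProperty("name"), utilizationPercent(busy, max));
            } catch (JMException e) {
                // MBean vanished or does not expose these attributes: skip it
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Integer> pools =
                threadPoolUtilization(ManagementFactory.getPlatformMBeanServer());
        if (pools.isEmpty()) {
            System.out.println("No Tomcat thread pools registered in this JVM.");
        }
        pools.forEach((name, percent) -> System.out.println(name + ": " + percent + "% busy"));
    }
}
```

Run inside the Tomcat JVM (or adapted to a remote JMX connection), a pool that sits at or near 100% during peak load is a candidate for the exhaustion described here.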
The root cause of exhausted pools is not only excessive usage. It can also be caused by long-running SQL queries, long-running batch jobs in async threads or long-running external calls. In these cases it pays to have sensible timeout settings. If you expect a remote call to respond within 1s, there is no need to define a default timeout of 60s. If the call doesn’t return within 1s, or let’s say 5s to give it a bit more time, it is better to abort that call and free your thread!
Another good sanity check is correlating incoming requests with the total number of active threads. The following shows the Dynatrace Process Health dashboards where you see these two metrics. It is easy to spot that the JVM on average runs 26x more threads than incoming requests!
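One generic way to enforce such a limit is to run the call on an executor and stop waiting after the timeout. This is a minimal sketch with made-up names (TimeoutGuard, callWithTimeout); Thread.sleep stands in for the slow remote call. Note the assumption it relies on: cancelling only frees the worker thread if the blocked call reacts to interruption. For real HTTP or JDBC calls the client’s own timeout settings (for example HttpRequest.Builder.timeout in java.net.http or Statement.setQueryTimeout in JDBC) are usually the better first choice.

```java
import java.time.Duration;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class TimeoutGuard {

    /** Runs the call on the executor and gives up after the timeout, interrupting the worker. */
    static <T> T callWithTimeout(ExecutorService executor, Callable<T> call, Duration timeout)
            throws TimeoutException, ExecutionException, InterruptedException {
        Future<T> future = executor.submit(call);
        try {
            return future.get(timeout.toMillis(), TimeUnit.MILLISECONDS);
        } catch (TimeoutException e) {
            future.cancel(true); // interrupt the worker so its thread can be reused
            throw e;
        }
    }

    /** Demo helper: "ok" if the simulated call finishes in time, "timeout" otherwise. */
    static String demo(long callMillis, long timeoutMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            return callWithTimeout(executor, () -> {
                Thread.sleep(callMillis); // stands in for a slow remote call
                return "ok";
            }, Duration.ofMillis(timeoutMillis));
        } catch (TimeoutException e) {
            return "timeout";
        } catch (ExecutionException e) {
            return "failed";
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return "interrupted";
        } finally {
            executor.shutdownNow();
        }
    }

    public static void main(String[] args) {
        System.out.println("fast call: " + demo(10, 1000));
        System.out.println("slow call: " + demo(2000, 100));
    }
}
```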
Correlating incoming transactions with total thread count is a great way to determine the extent to which your application is thread bound.
Tip for Dynatrace Users: If you create a new System Profile for your Java-based app, make sure to create the Tomcat, WebSphere, WebLogic, … specific metrics for Connection Pooling, Thread Pooling and Sessions. Just edit your System Profile, click on Measures and select Create Measure. Now pick from the available pre-configured measures under Server-Side Measures – Tomcat/WebLogic/WebSphere.
Tomcat Performance Problem #9: Multi-Threading: Locks, Syncs & Wait Issues
The last screenshot above showed not only that we have a lot of threads active in the application, but also that these threads are not doing a whole lot, as the CPU is barely utilized. This could mean several things, but most likely these threads are waiting on I/O (when calling external web services or executing SQL statements), trying to enter a synchronized code block, or waiting on other threads to finish.
The following screenshot shows a dashboard I often use when analyzing load tests. It’s from a story I blogged about last year. I typically chart the number of threads, the response time (avg and max) and the number of incoming requests, and I look at the layer breakdown showing me which layer of the app is consuming all the time. It was clear that the app couldn’t scale. The top load was reached at 0:50, which is when the throughput actually started to decline while response time still went up:
Great dashboard to correlate load with number of threads and response time. The Layer Breakdown also allows us to figure out which layers of the app consume all the time.
Now, what is really causing this issue? JMX metrics typically don’t get you much further than this. One option is to take thread dumps at the times when we see response time going up while CPU stays low. The thread dumps show us what the threads are really doing, which objects/monitors they are waiting on, and which other threads currently hold these objects/monitors:
Dynatrace provides a built-in Thread Dump Diagnostics Feature showing you which threads own monitors or wait on monitors owned by other threads!
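The same information is available from inside any JVM through the standard ThreadMXBean, which is what the following sketch uses. It is only an illustration of the idea, not how Dynatrace implements its feature: the class name BlockedThreadReport and the demo setup (one thread holding a lock while a second one tries to enter it) are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

final class BlockedThreadReport {

    /** Lists every thread that is BLOCKED on a monitor, with the thread that owns it. */
    static List<String> blockedThreads() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        List<String> report = new ArrayList<>();
        ThreadInfo[] dump = threads.dumpAllThreads(
                threads.isObjectMonitorUsageSupported(),
                threads.isSynchronizerUsageSupported());
        for (ThreadInfo info : dump) {
            if (info.getThreadState() == Thread.State.BLOCKED) {
                report.add(info.getThreadName() + " waits for " + info.getLockName()
                        + " held by " + info.getLockOwnerName());
            }
        }
        return report;
    }

    /** Demo: the calling thread holds a lock while a second thread tries to enter it. */
    static List<String> demo() {
        Object lock = new Object();
        Thread waiter = new Thread(() -> {
            synchronized (lock) {
                // enter and leave immediately once the lock is free
            }
        }, "demo-waiter");
        List<String> report;
        try {
            synchronized (lock) {
                waiter.start();
                while (waiter.getState() != Thread.State.BLOCKED) {
                    Thread.sleep(10); // wait until the second thread is stuck on the monitor
                }
                report = blockedThreads();
            }
            waiter.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return report;
    }

    public static void main(String[] args) {
        demo().forEach(System.out::println);
    }
}
```

If the same lock owner shows up for many blocked threads in consecutive dumps, that owner is a good place to start looking.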
My preferred and typically first go-to solution is the Response Time Hotspot view. Why? Because it shows me exactly how much time is spent in Sync & Wait per layer of my application. Check out the following screenshot. It is easy to spot that the biggest problem is code waiting in the Elasticsearch API as well as in the application’s own code:
The Response Time Hotspot view is great to identify if we have Wait, Sync, I/O or CPU issues. Breaking it down into logical layers of the app!
From here it is just a single click to the actual wait statement and the code that is waiting on the object. We can see that the code that tries to execute a remote service call has to wait until the next connection becomes available. So the side effect of making excessive remote calls on undersized connection pools is that your threads simply wait instead of doing work:
Identify which methods put threads to wait or sync: the real root cause could of course be bad connection pooling or excessive usage of resources such as pools and queues.
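To make the waiting visible without a profiler, you can measure how long a thread waits for a pooled resource and fail fast when the wait gets too long. The sketch below models a pool as a counting semaphore. It is a simplified stand-in with made-up names (BoundedPool, borrow, giveBack), not the API of any real connection pool.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;

/** A pool reduced to its essence: a fixed number of slots that callers borrow and return. */
final class BoundedPool {
    private final Semaphore slots;

    BoundedPool(int size) {
        this.slots = new Semaphore(size, true); // fair: the longest-waiting thread goes first
    }

    /** Borrows a slot; returns the time waited in ms, or -1 if none became free in time. */
    long borrow(long maxWaitMillis) {
        long start = System.nanoTime();
        try {
            if (!slots.tryAcquire(maxWaitMillis, TimeUnit.MILLISECONDS)) {
                return -1; // pool exhausted: the caller can fail fast instead of piling up
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return -1;
        }
        return TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
    }

    /** Returns a previously borrowed slot to the pool. */
    void giveBack() {
        slots.release();
    }

    public static void main(String[] args) {
        BoundedPool pool = new BoundedPool(1);
        System.out.println("first borrow waited ms: " + pool.borrow(50));
        System.out.println("second borrow (pool empty): " + pool.borrow(50));
        pool.giveBack();
        System.out.println("after giveBack waited ms: " + pool.borrow(50));
    }
}
```

Logging the returned wait time per request gives you a cheap signal for undersized pools: if it regularly approaches the maximum wait, threads are queuing for connections rather than doing work.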
If you want to read a bit more on this, check out my blog posts on Hidden Class Loading Performance Impact of the Spring Framework and Java Performance Impact By Dynamic Class Loading.
Tip for Dynatrace Users: If you drill down to the PurePath, look at the Thread Name Column as well as the Elapsed Time Column. They tell you which thread actually executed which methods and how much time passed between thread and runtime boundaries.
Tomcat Performance Problem #10: Memory: Leaks & Garbage Collection Impact
Much has been written about memory leaks, garbage collection and heap space tuning. Because I don’t want to repeat what we and others have said before, I will keep this short and leave you with a screenshot of the first thing I do when I analyze memory leaks. I look at the key memory metrics that every JVM exposes via JMX telling me whether we are suffering from high Garbage Collection or whether we have a potential memory leak:
Diagnosing Memory Leaks and GC issues should always start by looking at these metrics!
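The key metrics are available from the standard platform MXBeans, so you can sample them yourself. The following is a minimal sketch with a made-up class name (MemoryHealth) that reads heap usage and accumulated garbage collection time. As a rule of thumb, and not a hard threshold, a heap usage that keeps climbing across collections hints at a leak, while a high share of time spent in GC hints at GC pressure.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

final class MemoryHealth {

    /** Heap in use, in percent of the maximum heap (or of committed memory if no max is set). */
    static double heapUsedPercent() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        long limit = heap.getMax() > 0 ? heap.getMax() : heap.getCommitted();
        return 100.0 * heap.getUsed() / limit;
    }

    /** Accumulated GC time since JVM start, in percent of the JVM uptime. */
    static double gcTimePercent() {
        long gcMillis = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            gcMillis += Math.max(0, gc.getCollectionTime()); // -1 means "not available"
        }
        long uptimeMillis = ManagementFactory.getRuntimeMXBean().getUptime();
        return uptimeMillis <= 0 ? 0.0 : 100.0 * gcMillis / uptimeMillis;
    }

    public static void main(String[] args) {
        System.out.printf("heap used: %.1f%%%n", heapUsedPercent());
        System.out.printf("time in GC: %.2f%%%n", gcTimePercent());
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName() + ": " + gc.getCollectionCount() + " collections");
        }
    }
}
```

Sampling these values at a fixed interval and charting them over a load test gives you a view comparable to the dashboard above.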
If you want to learn more, please read my 5 Steps to Analyze Memory Leaks or watch my YouTube Tutorial about Memory Leak and GC Analysis.
Challenge: Prove me wrong or show me a new problem & win a speaking engagement with me!
I renew my offer just as I did in my previous blog posts. I want to give YOU the opportunity to get up on stage with me at a user group or conference that works for you. All you need to do is either prove to me that your app is not suffering from these problems (demo apps don’t count) or show me a new problem pattern that I do not yet have on the list.
Challenge Accepted? If so, just sign up for the Dynatrace Personal License. After the 30-day trial period it stays FREE FOR LIFE to analyze your local apps. After you have signed up and received the license file (check your spam folder for emails from email@example.com) you have two options:
- Full Install of Dynatrace in your environment -> Download and Install from here!
- Just use the pre-configured Dynatrace Docker Containers on GitHub -> special thanks to my colleague Martin Etmajer!
I also recommend checking out my YouTube Tutorials on What Is Dynatrace and How Does it Work as well as Tomcat Performance Analysis with Dynatrace. Once you have some Dynatrace PurePaths collected share them with me through my Share Your PurePath program.
So c’mon, let’s have some fun with this! Who is up for the challenge? First come – First win!
About The Author
Andreas Grabner has been helping companies improve their application performance for 15+ years. He is a regular contributor within Web Performance and DevOps communities and a prolific speaker at user groups and conferences around the world. Reach him at @grabnerandi