Environment :
- Tomcat 8 (same results in Tomcat 7)
- JDK 1.7
- Spring MVC 3
- Oracle UCP 11.2.0.3
- Windows Server 2008 R2
I have a single Tomcat 8 instance (same behavior exhibited in Tomcat 7) with two separate Spring MVC applications deployed in it. They share a common starter project and have many things in common, including their use of Oracle UCP, but perform different functions.
Project #1 has been in production for 9 months+ and has never had any speed issues. Project #2 is recently developed, similarly to Project #1, but suffers from a gradual slow down in response times.
The strange part is that, in the same application server, Project #1 runs fine/quickly always whereas Project #2 will gradually slow down over 24 to 48 hours and become so slow it is almost unusable (10+ second response times). Because of this evidence it seems that it is not a Tomcat issue but a project issue.
I don't believe it to be a memory leak because I have :
- Profiled the applications with jVisualVM and found no evidence of memory leaks
- Taken repeated heap dumps and analyzed them with MAT with no evidence of memory leaks
- Manually watched the memory usage and garbage collection and not seen any obvious evidence of memory leaks
- Scanned the code with FindBugs - no detrimental bugs found
- Scanned the code with SonarQube - no detrimental bugs found
- Tomcat's "detect leaks" function finds none
- And, it never seems to run out of memory, it just gets painfully slow. Redeploying the application returns it to normal speeds for a while.
I've also monitored the threads with jVisualVM and do not have any unexpected waiting threads. There are a lot of UCP-Worker-Threads, but I do maintain multiple connection pools with up to 10 connections per pool.
Viewing the Tomcat application specific logs for Project #2, the slowness doesn't seem to be attributed to any particular part of the application. It's just that every operation becomes very slow. For example, decrypting a string in (slow) Project #2 is shown to take over 1 second to execute after the deployment slows down. But, in the (fast) Project #1, the exact same code (same library) decrypts the string in milliseconds (or less than one millisecond). Those two tests were executed within a minute or two of each other.
I've also run (slow) Project #2 in Tomcat by itself (no other applications deployed besides Tomcat Manager) and the same behavior occurs over time. That should eliminate conflicts between the two applications as the cause.
I've been researching for this a long time and am running out of ideas. The main thing that confuses me is how one deployment can run very slowly while a similar deployment in the same instance of Tomcat can run quickly.
Looking for any help, suggestions, pointers, tools to use for more research, info, etc... please : )
(Rough) Timing Comparison
I added a small bit of code to both projects that simply counts to 100 million, appends 1 million characters to a StringBuffer, then returns the timing information in nanoseconds (System.nanoTime()). The following two tests were run back to back :
Fast/Normal Project #1
Counted to 100 million in : 174604ns
Appended 1 million characters to StringBuffer in 13075475ns
Total execution time : 13475805ns
Slow Project #2
Counted to 100 million in : 218954364ns
Appended 1 million characters to StringBuffer in 13098243ns
Total execution time : 450329651ns
Dependency Info
org.apache.poi - 3.9
org.json - 20131018
com.oracle.ucp - 11.2.0.3
com.oracle.ons - 11.2.0.3
com.oracle.ojdbc6 - 11.2.0.3
org.springframework.spring-context - 3.2.3.RELEASE
org.springframework.spring-webmvc - 3.2.3.RELEASE
org.springframework.spring-context-support - 3.2.3.RELEASE
org.aspectj.aspectjrt - 1.7.3
org.aspectj.aspectjweaver - 1.7.3
org.apache.httpcomponents.httpclient - 4.3
org.apache.httpcomponents.httpcore - 4.3
com.google.code.gson - 2.2.2
cglib - 2.2.2
org.codehaus.jackson.jackson-core-asl - 1.9.12
org.codehaus.jackson.jackson-mapper-asl - 1.9.12
com.nimbusds.nimbus-jose-jwt - 2.10.1
org.slf4j.slf4j-api - 1.7.5
ch.qos.logback.logback-classic - 1.0.13
ch.qos.logback.logback-core - 1.0.13
javax.inject - 1
javax.servlet.servlet-api - 2.5
javax.servlet.jsp.jsp-api - 2.1
javax.servlet.jstl - 1.2
org.jsoup - 1.7.2
commons-fileupload - 1.3
com.google.guava - 16.0.1
+1 proprietary internal library
<!-- Test Scope - shouldn't affect deployment -->
junit - 4.11
org.mockito.mockito-core - 1.9.5
org.springframework.spring-test - 3.2.3.RELEASE
org.hsqldb - 2.3.1
commons-dbcp - 1.2.2
org.hamcrest.hamcrest-all - 1.3
com.jayway.jsonpath.json-path - 0.8.1
com.jayway.jsonpath.json-path-assert - 0.8.1
Progress?
While under heavy load today I noticed several HTTP APR threads blocking to wait on logback. (fast) Project #1 does not use logback but log4j instead. That is one of the few dependencies that are different... we may have a culprit. I'll post back when I have more info.