8

I have been micro-optimizing our page response time on Tomcat and in almost all cases I will see a response time of 50ms if I refresh over and over again but if the page is not hit for a second or two the response time jumps back up to 500ms.

I have seen this same behavior regardless of local, not local, APR, NIO, JIO, static or dynamic responses (ie serving static file or handing the response dynamically). So far I have yet to see this behavior not happen on Tomcat (that is consistent sub 400ms regardless of frequency).

I have used Visual VM also to see if there were any clues.

I thought it was some sort of keep alive but when I run Apache Bench I get even faster (sub 50ms) response times (obviously because its hitting it frequently).

So how do you keep a low latency not frequently hit URL in Tomcat? Perhaps this question is better for ServerFault?

UPDATE: I'm almost positive its a Tomcat 6 issue. I thought I had tested on Tomcat 7 but I just tested again on it and had no issue (see results below). Even the latest Tomcat 6 still has this problem.

Here is the ab output for Tomcat 6 (notice the max):

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:    14   39  45.2     30     314
Waiting:       14   38  45.2     30     314
Total:         14   39  45.2     30     314

Here is ab output for Tomcat 7 notice the max:

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.0      0       0
Processing:    25   38   8.8     37      67
Waiting:       25   37   8.7     36      66
Total:         25   38   8.8     37      67

The Tomcat versions are the only difference (same machine, same JDK, etc...). I thought for sure the latest Tomcat 6 would be fine but it has similar latency on first request.

Adam Gent
  • 47,843
  • 23
  • 153
  • 203
  • It really depends on what your request is doing I'd think. Are you just grabbing an .html file, or are you initializing some sort of data service, ect. – aglassman Feb 21 '13 at 16:53
  • Are you using openjdk? I had some odd issues with it when I swapped to Oracle JDK. – Jaydee Feb 21 '13 at 16:57
  • Maybe there is some kind of full-GC 'stopping the world'... Did you check garbage collector output (verbose GC)? – home Feb 21 '13 at 17:03
  • 2
    Several months ago I was investigating similar issue with the Tomcat server. The average response time of `150ms` was jumping up to `700 - 900ms`. It tuned out that *long* response time was caused by Full GC. Maybe investigation with that in mind will help. – Tom Feb 21 '13 at 17:08
  • According to VisualVM I don't see any Full GCs happening. – Adam Gent Feb 21 '13 at 17:41
  • @Jaydee I'm using openjdk. I will try Sun's soon. – Adam Gent Feb 21 '13 at 17:42
  • Good question; not sure if it's a *coding* question, though. Maybe better for http://webmasters.stackexchange.com/? ;-) – Dave Jarvis Feb 21 '13 at 18:20
  • Hopefully someone will migrate it over. – Adam Gent Feb 21 '13 at 18:50
  • Is it URL specific? If you hit URL X frequently and then try and access URL Y does the holdup happen on URL Y? I'm wondering if it's having to create a new request handling thread (as opposed to reusing an old one) and reinitialize some thread-specific context. – Pace Feb 21 '13 at 21:18
  • @AdamGent Have you tried profiling? – fglez Feb 26 '13 at 12:56
  • Are you running the JVM in server mode? Have you altered the GC algorithms at all? – bh5k Feb 27 '13 at 21:02
  • @fglez yes as I answered in my post I ran VisualVM both looking at where the time was spent and memory and it was very difficult to tell what would cause random latency. That is there was no smoking gun. – Adam Gent Feb 27 '13 at 21:09
  • @bh5k Yes I ran the JVM in server mode. Also as I said I ran the exact same JVM with the same settings for both Tomcat 6 and Tomcat 7. – Adam Gent Feb 27 '13 at 21:11
  • 1
    I just filed a bug report with Oracle (http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=9000216) because com.sun.net.httpserver + URLConnection shows a response delay of approx 38 ms on Ubuntu amd64. The millisecond value in your posting looks suspiciously similar ;-). You could try the code pasted in question http://stackoverflow.com/questions/15235075/delay-in-httpserver-since-java-7/15249797#15249797 . – user1050755 Mar 07 '13 at 01:46
  • My guess would be some sort of pooled resource is being cleaned up when you drop the frequency of the requests. Tomcat has a pool of request handlers, perhaps try setting that larger by default. – wobblycogs Mar 08 '13 at 13:14

1 Answers1

1

Peeking at the tomcat code I decided to search on the word "Weak" on the theory that your problem is that something in a weak reference is collected when you don't re-request quickly.

I came up with the following guess... I found this interesting Class:

http://svn.apache.org/viewvc/tomcat/tc6.0.x/trunk/java/javax/el/BeanELResolver.java?revision=1027187&view=markup

It seems to maintain a cache of BeanProperties objects, a portion of which is handled by a WeakHashMap, and whenever the cache fills up, everything is put in a WeakHashMap and can be garbage collected. If items in the weak map are requested, they are put back in the main map (which is not weak). If your page trips this behavior right at the end of your processing (by causing something like the size of the cache in BeanProperties to be added you could wind up throwing away almost all of the cached bean descriptions.

Conveniently there is a property for tuning this:

private static final String CACHE_SIZE_PROP =
    "org.apache.el.BeanELResolver.CACHE_SIZE";

So maybe try playing with that and see if it influences the behavior? This might not be it however since I didn't see a big change (quick look) in this class in Tomcat 7 where you say your problem disappears. (have you been tweaking this property in your prior tuning efforts?)

Gus
  • 6,719
  • 6
  • 37
  • 58