1

I have an old legacy Java application that becomes very slow several times per week, and each time I have to restart Tomcat.

I checked New Relic Top Transactions and the error logs, but I can't find the source of the problem. The top transactions seem to be a consequence of the problem rather than its cause.

So I suspect it could be a memory leak. I took a heap dump and tried to analyze it in Eclipse Memory Analyzer, but I'm having difficulty identifying the leak, or even telling whether it really is a memory leak.

It seems that Problem Suspect 1 is com.opensymphony.oscache.web.ServletCache.

These are some of the results from Memory Analyzer:

Summary

Histogram

Problem Suspect 1

Also, this is the VisualVM monitor:

VisualVM Monitor
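Since the slowdown is hard to reproduce without real user load, one option is to log heap usage from inside the running JVM and line it up with the slow periods afterwards. Below is a minimal sketch using the standard `java.lang.management` API. The class name `HeapUsageSketch` and its methods are made up for illustration; they are not part of Tomcat, New Relic, or OSCache.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper for illustration; not part of Tomcat or OSCache.
class HeapUsageSketch {

    /** Returns the heap currently in use, in megabytes. */
    static long usedHeapMb() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed() / (1024 * 1024);
    }

    /** Returns the maximum heap in megabytes, or -1 if the JVM reports it as undefined. */
    static long maxHeapMb() {
        long max = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage().getMax();
        return max < 0 ? -1 : max / (1024 * 1024);
    }

    public static void main(String[] args) throws InterruptedException {
        // Print a few samples; in a real application this would run on a
        // scheduled background thread and write to the application log.
        for (int i = 0; i < 3; i++) {
            System.out.printf("heap used: %d MB of max %d MB%n", usedHeapMb(), maxHeapMb());
            Thread.sleep(1000);
        }
    }
}
```

If the used value keeps climbing between full garbage collections and never comes back down, that points toward a leak or an unbounded cache rather than a temporary load peak.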

Thank you! Any help or guidance with this would be very helpful!

This is the oscache.properties file:

cache.memory=true
cache.persistence.class=com.opensymphony.oscache.plugins.diskpersistence.HashDiskPersistenceListener
cache.path=/home/oscache/tb
dbeja
  • Perhaps a combination of using `UnlimitedCache` described [here](http://grepcode.com/file/repo1.maven.org/maven2/opensymphony/oscache/2.3/com/opensymphony/oscache/base/algorithm/UnlimitedCache.java?av=f) + the possibility of long-lived entries (not sure where this is configured) allows the cache to grow unbounded but rarely (if ever) shrink. – Andrew S Mar 20 '17 at 16:54
  • The unlimited cache is looking very likely, as @AndrewS indicated. I would expect that, for high traffic, that cache could cause serious trouble. – James Mar 20 '17 at 17:59
  • OK, so I checked oscache.properties and it has no setting for cache.capacity; I think the cache is unlimited by default. Could this be one source of the problem? What would be the best way to calculate the optimum value for this? – dbeja Mar 20 '17 at 18:39
  • Define "optimum". ;-) Seriously, anything is better than having to restart Tomcat all the time. So how about experimenting? How much memory would you like to grant to the cache? 4 GB seems to be way too much for your machine, so how about 1 GB? Check whether it makes the application more responsive in the long run without making it slower because of cache misses. If everything is okay, keep halving the size until you start noticing problems because the cache is too small. Just an idea, but this is what I would do. – kriegaex Mar 23 '17 at 14:13
  • Thanks @kriegaex! That's helpful! But if the application normally uses more than 6 GB of memory right now, wouldn't a limit of 1 GB restrict the machine too much? – dbeja Mar 23 '17 at 15:48
  • I thought you were implying that the 4.2 GB shown in the screenshot was already too much and a sign of a memory leak. Maybe I misunderstood. You know the volumetrics better than I do, so you should adjust it to what makes the most sense to you. BTW, I meant for you to limit the cache size only, not the whole memory for your VM. – kriegaex Mar 23 '17 at 15:57
  • No, you're right. I suppose there is a memory leak, given that memory is so high, but the system runs well at these values. Sometimes, though, memory grows above this level, the system gets very slow, and I have to restart it. My question is: if there is a memory leak, will imposing a cache limit prevent it from slowing the system down, or am I just making the problem appear earlier? I know there's no right answer for this :) I'm just trying to find possible paths to follow. Thank you – dbeja Mar 23 '17 at 16:18
  • @dbeja, AFAIU kriegaex and others suggest that the cache itself is the source of the memory leak. If this is the case, then limiting max cache size most probably will prevent system slowdown over time. So have you tried it? If so what are your results? – SergGr Mar 28 '17 at 22:54
  • I'll try this week. I'm just looking for the right time to do it, because the problem is difficult to replicate in a local environment without the user load. I'll try limiting the max cache size and also switching from memory caching to disk caching. I'll post an update here with the results. For now, to keep the system a bit more stable, I just doubled CPU and memory, but I know I'm just hiding the problem. – dbeja Mar 29 '17 at 11:20
  • I changed cache.memory to false and added cache.capacity=25000. The system is now more stable and fast. There are still some peaks where the system gets slow, but I didn't have to restart it; after 5 minutes memory went down again. Maybe I now need to fine-tune the cache.capacity value. – dbeja Mar 31 '17 at 09:23
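To make the "unlimited vs. capped" point from the comments above concrete, here is a small sketch of what a capacity limit does to a cache. It is built on `java.util.LinkedHashMap` purely for illustration: it is not OSCache's implementation, and `BoundedCacheSketch` / `LruCache` are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustration only: this is NOT how OSCache is implemented internally.
class BoundedCacheSketch {

    /** A least-recently-used map that never holds more than `capacity` entries. */
    static final class LruCache<K, V> extends LinkedHashMap<K, V> {
        private final int capacity;

        LruCache(int capacity) {
            // accessOrder=true: iteration runs from least to most recently used
            super(16, 0.75f, true);
            this.capacity = capacity;
        }

        @Override
        protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
            // Called after every put; returning true evicts the eldest entry.
            return size() > capacity;
        }
    }

    public static void main(String[] args) {
        Map<String, String> unbounded = new LinkedHashMap<>();
        Map<String, String> bounded = new LruCache<>(3);
        for (int i = 0; i < 10; i++) {
            unbounded.put("page-" + i, "rendered content " + i);
            bounded.put("page-" + i, "rendered content " + i);
        }
        // The plain map keeps all ten entries; the capped one keeps only the last three.
        System.out.println("unbounded size: " + unbounded.size());
        System.out.println("bounded size:   " + bounded.size());
        System.out.println("bounded keys:   " + bounded.keySet());
    }
}
```

The unbounded map holds a strong reference to every entry ever inserted, so the garbage collector can never reclaim them. That is the growth pattern a heap dump would attribute to the cache class. The capped map trades that for cache misses once the working set exceeds the capacity.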

2 Answers

1

A few things I would suggest to get rid of the issue.

Use disk caching instead of the memory cache, if your use case allows it.

In the configuration file for OSCache:

cache.memory=false
cache.persistence.class=com.opensymphony.oscache.plugins.diskpersistence.DiskPersistenceListener
cache.path=/opt/myapp/cache
cache.capacity=1000

If disk caching is not an option, try reducing the cache capacity:

cache.capacity=1000

Please provide the configuration details of the oscache for a better review if possible.

Update

The HashDiskPersistenceListener is used when the property cache.memory=false

We have two options to try out:

1) provide a value for cache capacity

# or a value that covers the use case
cache.capacity=1000

2) make the cache use disk persistence

cache.memory=false
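One thing to watch when editing oscache.properties: in the `java.util.Properties` file format, a `#` only starts a comment at the beginning of a line, so anything written after a value on the same line becomes part of that value. The sketch below shows the difference. It assumes OSCache reads the file through `java.util.Properties` (I believe it does, but treat that as an assumption), and `PropertiesCommentSketch` is a hypothetical name.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class; it only exercises java.util.Properties, not OSCache.
class PropertiesCommentSketch {

    /** Parses properties text and returns the raw value stored under cache.capacity. */
    static String capacityOf(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("cache.capacity");
    }

    public static void main(String[] args) {
        String inline = "cache.capacity=1000 #or a value that covers the use case\n";
        String ownLine = "# or a value that covers the use case\ncache.capacity=1000\n";

        // Trailing text is kept as part of the value, so it no longer parses as a number.
        System.out.println("[" + capacityOf(inline) + "]");
        // With the comment on its own line the value is just the number.
        System.out.println("[" + capacityOf(ownLine) + "]");
    }
}
```

The first call yields the whole string `1000 #or a value that covers the use case`, the second yields `1000`, so it is safer to keep comments on their own lines.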
Abdul Rahman
  • Thanks! I updated the question with oscache.properties file. In what use cases do you think disk caching would not be a good solution? – dbeja Mar 28 '17 at 13:52
  • I will try this during this week! I just need to find the right time to do it because it's an app with many users. From the screenshots above what would be your first guess for cache.capacity? Thanks – dbeja Mar 29 '17 at 11:16
  • So, I changed cache.memory to false and added cache.capacity. 1000 was a very low value for the system, and right now I have a cache.capacity of 25000. The system seems to be much more stable. Still, sometimes I get a peak in memory and the system gets slow for some minutes, but I don't need to restart; the system recovers by itself and memory goes down again. Maybe by fine-tuning this capacity limit I can get slightly better performance. – dbeja Mar 31 '17 at 09:22
0

I would suggest using the trial version of YourKit Java Profiler; it will give you much more detail about your legacy application code. I used this tool back in 2014, as a trial version, to detect memory leaks in a web application based on Struts 2 and Hibernate. Here is the link:

Your Kit

Hus Mukh