20

I have Tomcat 8.5.9 running on an AWS box with 10 different WebSocket apps deployed, each of which basically acts as a message broker. The HTTPS connector is using the Http11NioProtocol. The only parameter I have set is maxThreads=200, along with the certificate info.

The request volume is not very high. It has been running since Monday morning, and here is what the manager status says:

Max threads: 200
Current thread count: 38
Current thread busy: 0
Keep alive sockets count: 1
Max processing time: 234 ms
Processing time: 17.254 s
Request count: 33351
Error count: 325
Bytes received: 0.00 MB
Bytes sent: 34.07 MB

After a few days, I notice that memory usage continues to grow. I have to restart the Tomcat service about every two weeks to avoid an OutOfMemoryError.

I have been taking heap dumps and analyzing them with Eclipse MAT, which always points to the WsFrameServer class as the problem suspect. The most recent dump displays the following:

5,146 instances of "org.apache.tomcat.websocket.server.WsFrameServer",
loaded by "java.net.URLClassLoader @ 0x6c0047c28" occupy 1,383,143,200
(73.13%) bytes. These instances are referenced from one instance of
"java.util.concurrent.ConcurrentHashMap$Node[]"

The Dominator Tree currently has 106,000 entries, most of which are instances of the WsFrameServer class.

Am I doing something wrong or is this "normal"? Are there any specific settings either on Tomcat or on the Connector that I should be setting to prevent this from happening?

Thanks in advance.

EDIT: I'm not sure if this is helpful, but here is what the VisualVM monitor looks like:

VisualVM Monitor

Tommo

3 Answers

5

It is hard to be certain without more detail, but this is probably related to your session retention. What I think is happening is that the WsFrameServer, which extends WsFrameBase, is added to the session.
If you have an unlimited session retention policy, then you will eventually run out of memory.

Try setting a non-zero sessionTimeout.

Magnus
  • My conf\web.xml specifies the following: `<session-timeout>30</session-timeout>`. This is not overridden in any of the webapp/WEB-INF/web.xml files either. Do I need to specify this elsewhere? – Tommo Jul 31 '17 at 13:12
  • Using the heap dump, try to see which class holds the reference to the `ConcurrentHashMap` that is holding all of the `WsFrameServer`s; it should shed a bit more light on the issue. Also, are you using a persistent session manager? – Magnus Aug 01 '17 at 03:06
  • I am not currently using a persistent sessions manager. Checking Eclipse MAT now for the class holding reference to the `ConcurrentHashMap`. – Tommo Aug 01 '17 at 03:13
  • @Tommo so what was the actual solution? Which properties have you updated? – Mark Bramnik Sep 14 '21 at 04:38
0

Code is missing from your question (especially how you manage the WebSocket connections).

Did you use Tomcat in async mode with a list of connections kept somewhere?

Did you remember to bind both the close AND error events to code that removes the faulty connection from the list?
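As a minimal sketch of that idea, the registry below removes a connection on both close and error. `Connection`, `ConnectionRegistry`, and `ConnectionRegistryDemo` are hypothetical names made up for illustration, not Tomcat or javax.websocket types. In a real endpoint, the three methods would presumably be called from the endpoint's `@OnOpen`, `@OnClose`, and `@OnError` handlers.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a WebSocket session; not a Tomcat or javax.websocket type.
interface Connection {
    String id();
}

// Tracks open connections and removes them on both close and error.
class ConnectionRegistry {
    private final Map<String, Connection> connections = new ConcurrentHashMap<>();

    // Call when a connection opens (e.g. from an @OnOpen method).
    void onOpen(Connection c) {
        connections.put(c.id(), c);
    }

    // Call when a connection closes normally (e.g. from an @OnClose method).
    void onClose(Connection c) {
        connections.remove(c.id());
    }

    // Call when a connection fails (e.g. from an @OnError method).
    // Removing here as well covers connections that never get a clean close.
    void onError(Connection c, Throwable cause) {
        connections.remove(c.id());
    }

    int size() {
        return connections.size();
    }
}

class ConnectionRegistryDemo {
    public static void main(String[] args) {
        ConnectionRegistry registry = new ConnectionRegistry();
        Connection a = () -> "a";
        Connection b = () -> "b";
        registry.onOpen(a);
        registry.onOpen(b);
        registry.onClose(a);                                 // normal close
        registry.onError(b, new RuntimeException("reset"));  // abrupt failure
        System.out.println("tracked connections: " + registry.size());
    }
}
```

Removing by id makes both handlers idempotent, so it does no harm if a close event also arrives after an error for the same connection.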

wargre
  • I'm not entirely sure what you are asking. As far as I know, I am not using tomcat in async mode. I am maintaining a data structure that contains references to sessions, these sessions are removed from the data structure during a close event. I am NOT removing the sessions during an error event at the moment. – Tommo Aug 04 '17 at 14:01
  • It is hard to explain. With SSE (not so far from WebSocket), for instance, https://golb.hplar.ch/p/Server-Sent-Events-with-Spring has `emitter.onCompletion(() -> this.emitters.remove(emitter)); emitter.onTimeout(() -> this.emitters.remove(emitter));`. That is the way to manage WebSocket / SSE emitters. If you use the session to store the WebSocket, what do you use WebSocket for? A session means you are already in a user context, so just send the data back; there is no need for WebSocket. – wargre Aug 04 '17 at 14:14
0

As we all know, the Java GC is lazy: heap usage keeps growing until the JVM runs low on free memory, and only then is a GC triggered to collect garbage.

From your VisualVM screenshot, the memory usage looks relatively normal: usage climbs over time, then drops after a GC.

So I wonder whether your app will really crash with an OOM. You could reproduce it in your test environment and analyze the heap dump taken at the OOM itself, which is more useful.
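One rough way to tell a normal sawtooth from a real leak is to watch how much heap is still in use right after a collection: if that floor keeps rising over days, something is being retained. The sketch below samples it with the standard `java.lang.management` API. `HeapFloorSampler` is a made-up name, and since `MemoryMXBean.gc()` is only a request to the JVM, the numbers should be read as approximate.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

class HeapFloorSampler {
    // Returns used heap in bytes after asking the JVM to collect.
    // gc() is effectively System.gc(), i.e. a hint, so treat the result as approximate.
    static long usedHeapAfterGc() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        memory.gc();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    public static void main(String[] args) throws InterruptedException {
        // Take a few samples; in practice the interval would be minutes or hours.
        for (int i = 0; i < 3; i++) {
            long usedMb = usedHeapAfterGc() / (1024 * 1024);
            System.out.printf("sample %d: %d MB used after GC%n", i, usedMb);
            Thread.sleep(1000);
        }
    }
}
```

Forcing collections on a production server has a cost, so this is better suited to a test environment.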

By the way, I suggest VisualVM over MAT, because in my experience MAT can treat some unreachable objects as GC roots. That makes the memory analysis very inefficient and gives results that differ from other tools, which is what I ran into on one of our projects.

Tony