We have a client server app, 1 server, about 10 clients. They communicate via tcp sockets using custom queries.
The system had been running smooth for many months, but at some point, after the daily scheduled server FULL GC that took about 50s, we figured out that the time between the queries sent by the client and responses received from the server was large, > 10-20s. After some 3 hours the system recovered, everything was running fine again.
While investigating the issue, we found:
- No garbage collection problems on both clients and server
- Query processing time on server was small.
- Load on server was high.
- The network bandwidth was not saturated.
- Connections were not reset during the FULL GC (the daily FULL GC was a normal event until then)
- The machine and OS changed recently from Centos 6 (kernel 2.6.32) to Centos 7 (kernel 3.10.0), but the new config was extensivelly tested. Also Oracle JDK version changed from 1.7.65 to 1.7.75.
We took a thread dump on the server:
java.lang.Thread.State: RUNNABLE
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at util.network.BytesBasedSocketConnection$ReadConnectionRunnable.run(BytesBasedSocketConnection.java:293)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
The FilterInputStream.read()
is the following:
public int read() throws IOException {
return in.read();
}
The in
in our code is a BufferedInputStream
.
The questions are: Why most connections slowed after the Full GC pause? Why the stacktrace ends in FilterInputStream.read()
? Shouldn't it end somewhere in the BufferedInputStream
or in socket input stream? Can this read lead to high load on server?
The code we use for reading:
int constructLength = _socketDIS.readInt();
ByteArrayOutputStream constructBOAS = new ByteArrayOutputStream(constructLength);
for (int i = 0; i != constructLength; i++)
constructBOAS.write(_socketDIS.read());
constructBOAS.close();
byte[] bytes = constructBOAS.toByteArray();
where:
_socketDIS = new DataInputStream(new BufferedInputStream(_socket.getInputStream()));
Here is the stacktrace from the well working client connections:
java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:152)
at java.net.SocketInputStream.read(SocketInputStream.java:122)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:235)
at java.io.BufferedInputStream.read(BufferedInputStream.java:254)
- locked <0x00007f522cbebca8> (a java.io.BufferedInputStream)
at java.io.DataInputStream.readInt(DataInputStream.java:387)
at util.network.BytesBasedSocketConnection$ReadConnectionRunnable.run(BytesBasedSocketConnection.java:287)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
at java.util.concurrent.FutureTask.run(FutureTask.java:262)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:178)
at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:292)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:745)
UPDATE:
Regarding the EJP answer:
There was no EOS involved, the connections were up, but they were very slow
Even if there was an EOS I can't see how the code could spin at the EOS, the
for
is bounded by theconstructLength
value. But still, the suggested improvement is valid.The stacktrace with the problem ends in a read done on a
DataInputStream
((_socketDIS.read()
) that is inherited fromFilterInputStream.read()
, see code above.DataInputStream
, notBufferedInputStream
is missing theread()
. Here inFilterInputStream.read()
there is ain.read()
called on aBufferedInputStream
, this one has its ownread()
method defined. But the stacktrace stops in the middle, is not reachingBufferedInputStream.read()
. Why?