Ignite version: 2.14.0
Node configuration: 2 Nodes running on same PC (IPV4) using localhost and 255 available ports:
TcpDiscoveryMulticastIpFinder ipFinder = new TcpDiscoveryMulticastIpFinder();
ipFinder.setAddresses(Collections.singletonList("127.0.0.1"));
Also 2 different working dirs, Threadpool 16, 2 caches (one atomic, one transactional)
What happens: Using ExecutorService i submit 8 threads to pool. Class run correctly (4 on each node) and execute tasks as expected.
But during execution raise, repeatedly and with some frequency, the following exception on both nodes: GRAVE: "Failed to process selector key".
The application generates a high computational load. A simple "for loop" with a sleep gives no error
Full stack follows:
GRAVE: Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=3, bytesRcvd=97567668, bytesSent=100128669, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-3, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1675265761563, hashCode=2143442267, interrupted=false, runner=grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=1690656, resendCnt=0, rcvCnt=1696452, sentCnt=1691375, reserved=true, lastAck=1696448, nodeLeft=false, node=TcpDiscoveryNode [id=cd1ffdf0-b9b3-49ef-a9e3-db1676fad428, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675265584899, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=69, queueLimit=4096, reserveCnt=101, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=1690656, resendCnt=0, rcvCnt=1696452, sentCnt=1691375, reserved=true, lastAck=1696448, nodeLeft=false, node=TcpDiscoveryNode [id=cd1ffdf0-b9b3-49ef-a9e3-db1676fad428, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675265584899, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=69, queueLimit=4096, reserveCnt=101, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric@69a257d1, super=GridNioSessionImpl [locAddr=/0:0:0:0:0:0:0:1:47101, rmtAddr=/0:0:0:0:0:0:0:1:56361, createTime=1675265760336, closeTime=0, bytesSent=8479762, bytesRcvd=7459908, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1675265760336, lastSndTime=1675265761545, lastRcvTime=1675265761563, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@b329ba4, directMode=true], GridConnectionBytesVerifyFilter], accepted=true, markedForClose=true]]]
java.io.IOException: Connessione in corso interrotta forzatamente dall'host remoto
at java.base/sun.nio.ch.SocketDispatcher.write0(Native Method)
at java.base/sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:51)
at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:113)
at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:58)
at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:50)
at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:466)
at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite0(GridNioServer.java:1715)
at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite(GridNioServer.java:1407)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2511)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2273)
at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1910)
at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
at java.base/java.lang.Thread.run(Thread.java:834)
Expected: I read that it could be a configuration problem but I don't understand how to fix it.
The configuration seems simple but and even if the execution is without calculation errors i would like to avoid this exception.
NODE1
[2023-02-02 15:54:30] [AVVERTENZA] Client disconnected abruptly due to network connection loss or because the connection was left open on application shutdown. [cls=class o.a.i.i.util.nio.GridNioException, msg=Connessione in corso interrotta forzatamente dall'host remoto] - [org.apache.ignite.logger.java.JavaLogger warning:]
[2023-02-02 15:54:30] [AVVERTENZA] Unacknowledged messages queue size overflow, will attempt to reconnect [remoteAddr=/127.0.0.1:63660, queueLimit=4096] - [org.apache.ignite.logger.java.JavaLogger warning:]
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63670] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63671] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Received incoming connection when already connected to this node, rejecting [locNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNode=8a378874-f3ae-4d0c-9733-a6b143097658] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63672] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Received incoming connection when already connected to this node, rejecting [locNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNode=8a378874-f3ae-4d0c-9733-a6b143097658] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63673] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] Received incoming connection when already connected to this node, rejecting [locNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNode=8a378874-f3ae-4d0c-9733-a6b143097658] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [GRAVE ] Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=0, bytesRcvd=2269317, bytesSent=3928093, bytesRcvd0=1909138, bytesSent0=720914, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1675349670621, hashCode=722948156, interrupted=false, runner=grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=17152, resendCnt=870, rcvCnt=31061, sentCnt=18796, reserved=true, lastAck=31040, nodeLeft=false, node=TcpDiscoveryNode [id=8a378874-f3ae-4d0c-9733-a6b143097658, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675349650217, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=7, queueLimit=4096, reserveCnt=9, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=17152, resendCnt=870, rcvCnt=31061, sentCnt=18796, reserved=true, lastAck=31040, nodeLeft=false, node=TcpDiscoveryNode [id=8a378874-f3ae-4d0c-9733-a6b143097658, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675349650217, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=7, queueLimit=4096, reserveCnt=9, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric@69a257d1, super=GridNioSessionImpl [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63670, createTime=1675349670241, closeTime=0, bytesSent=720914, bytesRcvd=1909138, bytesSent0=720914, bytesRcvd0=1909138, sndSchedTime=1675349670241, lastSndTime=1675349670277, lastRcvTime=1675349670621, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@16179752, directMode=true], GridConnectionBytesVerifyFilter], accepted=true, markedForClose=true]]] - [org.apache.ignite.logger.java.JavaLogger error:
java.io.IOException: Connessione in corso interrotta forzatamente dall'host remoto
NODE2
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:63669] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Received incoming connection from remote node while connecting to this node, rejecting [locNode=8a378874-f3ae-4d0c-9733-a6b143097658, locNodeOrder=1, rmtNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNodeOrder=2] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Established outgoing communication connection [locAddr=/127.0.0.1:63670, rmtAddr=/127.0.0.1:47101] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] Established outgoing communication connection [locAddr=/127.0.0.1:63676, rmtAddr=/127.0.0.1:47101] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] TCP client created [client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=1, bytesRcvd=84, bytesSent=56, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-1, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1675349671637, hashCode=762674116, interrupted=false, runner=grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=9391 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=31061, resendCnt=753, rcvCnt=17160, sentCnt=31871, reserved=true, lastAck=17152, nodeLeft=false, node=TcpDiscoveryNode [id=af74d5c9-3631-4fdf-b9f2-0babc853019f, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47501, /0:0:0:0:0:0:0:1:47501, WOPR/192.168.56.1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, lastExchangeTime=1675349650060, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=false, connectCnt=8, queueLimit=4096, reserveCnt=9, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=31061, resendCnt=531, rcvCnt=17160, sentCnt=31871, reserved=true, lastAck=17152, nodeLeft=false, node=TcpDiscoveryNode [id=af74d5c9-3631-4fdf-b9f2-0babc853019f, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47501, /0:0:0:0:0:0:0:1:47501, WOPR/192.168.56.1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, lastExchangeTime=1675349650060, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=false, connectCnt=8, queueLimit=4096, reserveCnt=9, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=org.apache.ignite.internal.processors.metric.impl.LongAdderMetric@69a257d1, super=GridNioSessionImpl [locAddr=/127.0.0.1:63676, rmtAddr=/127.0.0.1:47101, createTime=1675349671637, closeTime=0, bytesSent=0, bytesRcvd=0, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1675349671637, lastSndTime=1675349671637, lastRcvTime=1675349671637, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@544beb47, directMode=true], GridConnectionBytesVerifyFilter], accepted=false, markedForClose=false]], super=GridAbstractCommunicationClient [lastUsed=1675349671637, closed=false, connIdx=0]], duration=339ms] - [org.apache.ignite.logger.java.JavaLogger info:]