
Ignite version: 2.14.0

Node configuration: two nodes running on the same PC (IPv4), using localhost with 255 available ports:

// Register only the loopback address with the IP finder.
TcpDiscoveryMulticastIpFinder ipFinder = new TcpDiscoveryMulticastIpFinder();
ipFinder.setAddresses(Collections.singletonList("127.0.0.1"));

In addition: two different working directories, a thread pool of size 16, and two caches (one atomic, one transactional).

What happens: using the ExecutorService, I submit 8 threads to the pool. The classes run correctly (4 on each node) and execute their tasks as expected.

However, during execution the following exception is raised repeatedly, and fairly often, on both nodes at SEVERE level (logged as "GRAVE" by the Italian locale): "Failed to process selector key".

The application generates a high computational load. A simple "for" loop with a sleep produces no error.
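The submission pattern described above can be sketched roughly as follows. This is only an illustration under stated assumptions: a local `java.util.concurrent` thread pool stands in for the cluster-wide executor, and the names `SubmitSketch`, `heavyWork`, and `runAll` are hypothetical. Only the pool size (16) and the task count (8) come from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Sketch only: a local pool replaces the distributed executor (assumption).
final class SubmitSketch {

    // Hypothetical CPU-bound task: pure arithmetic, no sleeping.
    static long heavyWork(int seed) {
        long acc = 0;
        for (int i = 1; i <= 200_000; i++) {
            acc += ((long) i * i + seed) % 7;
        }
        return acc;
    }

    // Submits taskCount tasks to the pool and waits for every result.
    static List<Long> runAll(ExecutorService pool, int taskCount) {
        List<Future<Long>> futures = new ArrayList<>();
        for (int t = 0; t < taskCount; t++) {
            final int seed = t;
            Callable<Long> task = () -> heavyWork(seed);
            futures.add(pool.submit(task));
        }
        List<Long> results = new ArrayList<>();
        for (Future<Long> f : futures) {
            try {
                results.add(f.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            } catch (ExecutionException e) {
                throw new IllegalStateException("task failed", e.getCause());
            }
        }
        return results;
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(16); // pool size from the question
        try {
            System.out.println(runAll(pool, 8).size() + " tasks completed");
        } finally {
            pool.shutdown();
        }
    }
}
```

Replacing `heavyWork` with a loop that only sleeps gives the low-load variant mentioned above, which may help when comparing the two cases.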

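Full stack trace follows (the IOException message is Italian for "An existing connection was forcibly closed by the remote host"):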

GRAVE: Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=3, bytesRcvd=97567668, bytesSent=100128669, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-3, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1675265761563, hashCode=2143442267, interrupted=false, runner=grid-nio-worker-tcp-comm-3-#26%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=1690656, resendCnt=0, rcvCnt=1696452, sentCnt=1691375, reserved=true, lastAck=1696448, nodeLeft=false, node=TcpDiscoveryNode [id=cd1ffdf0-b9b3-49ef-a9e3-db1676fad428, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675265584899, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=69, queueLimit=4096, reserveCnt=101, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=1690656, resendCnt=0, rcvCnt=1696452, sentCnt=1691375, reserved=true, lastAck=1696448, nodeLeft=false, node=TcpDiscoveryNode [id=cd1ffdf0-b9b3-49ef-a9e3-db1676fad428, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675265584899, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=69, queueLimit=4096, reserveCnt=101, 
pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric@69a257d1, super=GridNioSessionImpl [locAddr=/0:0:0:0:0:0:0:1:47101, rmtAddr=/0:0:0:0:0:0:0:1:56361, createTime=1675265760336, closeTime=0, bytesSent=8479762, bytesRcvd=7459908, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1675265760336, lastSndTime=1675265761545, lastRcvTime=1675265761563, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@b329ba4, directMode=true], GridConnectionBytesVerifyFilter], accepted=true, markedForClose=true]]]
java.io.IOException: Connessione in corso interrotta forzatamente dall'host remoto
    at java.base/sun.nio.ch.SocketDispatcher.write0(Native Method)
    at java.base/sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:51)
    at java.base/sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:113)
    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:58)
    at java.base/sun.nio.ch.IOUtil.write(IOUtil.java:50)
    at java.base/sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:466)
    at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite0(GridNioServer.java:1715)
    at org.apache.ignite.internal.util.nio.GridNioServer$DirectNioClientWorker.processWrite(GridNioServer.java:1407)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.processSelectedKeysOptimized(GridNioServer.java:2511)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.bodyInternal(GridNioServer.java:2273)
    at org.apache.ignite.internal.util.nio.GridNioServer$AbstractNioClientWorker.body(GridNioServer.java:1910)
    at org.apache.ignite.internal.util.worker.GridWorker.run(GridWorker.java:125)
    at java.base/java.lang.Thread.run(Thread.java:834)

Expected: I have read that this could be a configuration problem, but I don't understand how to fix it.

The configuration seems simple, and even though execution completes without calculation errors, I would like to avoid this exception.

NODE1

 [2023-02-02 15:54:30] [AVVERTENZA] Client disconnected abruptly due to network connection loss or because the connection was left open on application shutdown. [cls=class o.a.i.i.util.nio.GridNioException, msg=Connessione in corso interrotta forzatamente dall'host remoto] - [org.apache.ignite.logger.java.JavaLogger warning:]
[2023-02-02 15:54:30] [AVVERTENZA] Unacknowledged messages queue size overflow, will attempt to reconnect [remoteAddr=/127.0.0.1:63660, queueLimit=4096] - [org.apache.ignite.logger.java.JavaLogger warning:]
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63670] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63671] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Received incoming connection when already connected to this node, rejecting [locNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNode=8a378874-f3ae-4d0c-9733-a6b143097658] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63672] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Received incoming connection when already connected to this node, rejecting [locNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNode=8a378874-f3ae-4d0c-9733-a6b143097658] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63673] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] Received incoming connection when already connected to this node, rejecting [locNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNode=8a378874-f3ae-4d0c-9733-a6b143097658] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [GRAVE  ] Failed to process selector key [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=0, bytesRcvd=2269317, bytesSent=3928093, bytesRcvd0=1909138, bytesSent0=720914, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-0, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1675349670621, hashCode=722948156, interrupted=false, runner=grid-nio-worker-tcp-comm-0-#23%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=17152, resendCnt=870, rcvCnt=31061, sentCnt=18796, reserved=true, lastAck=31040, nodeLeft=false, node=TcpDiscoveryNode [id=8a378874-f3ae-4d0c-9733-a6b143097658, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675349650217, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=7, queueLimit=4096, reserveCnt=9, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=17152, resendCnt=870, rcvCnt=31061, sentCnt=18796, reserved=true, lastAck=31040, nodeLeft=false, node=TcpDiscoveryNode [id=8a378874-f3ae-4d0c-9733-a6b143097658, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47500, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47500, /0:0:0:0:0:0:0:1:47500, WOPR/192.168.56.1:47500, /127.0.0.1:47500], discPort=47500, order=1, intOrder=1, lastExchangeTime=1675349650217, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=true, connectCnt=7, queueLimit=4096, reserveCnt=9, 
pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=o.a.i.i.processors.metric.impl.LongAdderMetric@69a257d1, super=GridNioSessionImpl [locAddr=/127.0.0.1:47101, rmtAddr=/127.0.0.1:63670, createTime=1675349670241, closeTime=0, bytesSent=720914, bytesRcvd=1909138, bytesSent0=720914, bytesRcvd0=1909138, sndSchedTime=1675349670241, lastSndTime=1675349670277, lastRcvTime=1675349670621, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=o.a.i.i.util.nio.GridDirectParser@16179752, directMode=true], GridConnectionBytesVerifyFilter], accepted=true, markedForClose=true]]] - [org.apache.ignite.logger.java.JavaLogger error:
java.io.IOException: Connessione in corso interrotta forzatamente dall'host remoto
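The recovery descriptor in the entry above packs several counters into one long line. A small helper along these lines might make them easier to compare between nodes. It is a sketch, not part of Ignite: `DescriptorFields` and `parse` are hypothetical names, the excerpt is abbreviated from the NODE1 log, and reading `sentCnt - acked` as "messages still awaiting acknowledgement" is an assumption about what the counters mean.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for reading name=<integer> fields out of a log line.
final class DescriptorFields {

    // Matches name=<digits> only when followed by ',' or ']', so values such
    // as ver=2.14.0 or id=8a37... are skipped.
    private static final Pattern NUMERIC_FIELD =
        Pattern.compile("\\b([A-Za-z][A-Za-z0-9]*)=(\\d+)(?=[,\\]])");

    // Returns the first occurrence of each numeric field, in log order.
    static Map<String, Long> parse(String logLine) {
        Map<String, Long> fields = new LinkedHashMap<>();
        Matcher m = NUMERIC_FIELD.matcher(logLine);
        while (m.find()) {
            fields.putIfAbsent(m.group(1), Long.parseLong(m.group(2)));
        }
        return fields;
    }

    public static void main(String[] args) {
        // Abbreviated excerpt of the NODE1 descriptor (node details omitted).
        String excerpt = "inRecovery=GridNioRecoveryDescriptor [acked=17152, resendCnt=870, "
            + "rcvCnt=31061, sentCnt=18796, reserved=true, lastAck=31040, "
            + "connectCnt=7, queueLimit=4096, reserveCnt=9]";
        Map<String, Long> f = parse(excerpt);
        System.out.println(f);
        // Assumption: this difference approximates the unacknowledged backlog.
        System.out.println("sentCnt - acked = " + (f.get("sentCnt") - f.get("acked")));
    }
}
```

Because `putIfAbsent` keeps the first occurrence, a full log line yields the `inRecovery` values rather than the `outRecovery` ones when both descriptors appear.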

NODE2

[2023-02-02 15:54:30] [INFORMAZIONI] Accepted incoming communication connection [locAddr=/127.0.0.1:47100, rmtAddr=/127.0.0.1:63669] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Received incoming connection from remote node while connecting to this node, rejecting [locNode=8a378874-f3ae-4d0c-9733-a6b143097658, locNodeOrder=1, rmtNode=af74d5c9-3631-4fdf-b9f2-0babc853019f, rmtNodeOrder=2] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:30] [INFORMAZIONI] Established outgoing communication connection [locAddr=/127.0.0.1:63670, rmtAddr=/127.0.0.1:47101] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] Established outgoing communication connection [locAddr=/127.0.0.1:63676, rmtAddr=/127.0.0.1:47101] - [org.apache.ignite.logger.java.JavaLogger info:]
[2023-02-02 15:54:31] [INFORMAZIONI] TCP client created [client=GridTcpNioCommunicationClient [ses=GridSelectorNioSessionImpl [worker=DirectNioClientWorker [super=AbstractNioClientWorker [idx=1, bytesRcvd=84, bytesSent=56, bytesRcvd0=0, bytesSent0=0, select=true, super=GridWorker [name=grid-nio-worker-tcp-comm-1, igniteInstanceName=TcpCommunicationSpi, finished=false, heartbeatTs=1675349671637, hashCode=762674116, interrupted=false, runner=grid-nio-worker-tcp-comm-1-#24%TcpCommunicationSpi%]]], writeBuf=java.nio.DirectByteBuffer[pos=9391 lim=32768 cap=32768], readBuf=java.nio.DirectByteBuffer[pos=0 lim=32768 cap=32768], inRecovery=GridNioRecoveryDescriptor [acked=31061, resendCnt=753, rcvCnt=17160, sentCnt=31871, reserved=true, lastAck=17152, nodeLeft=false, node=TcpDiscoveryNode [id=af74d5c9-3631-4fdf-b9f2-0babc853019f, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47501, /0:0:0:0:0:0:0:1:47501, WOPR/192.168.56.1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, lastExchangeTime=1675349650060, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=false, connectCnt=8, queueLimit=4096, reserveCnt=9, pairedConnections=false], outRecovery=GridNioRecoveryDescriptor [acked=31061, resendCnt=531, rcvCnt=17160, sentCnt=31871, reserved=true, lastAck=17152, nodeLeft=false, node=TcpDiscoveryNode [id=af74d5c9-3631-4fdf-b9f2-0babc853019f, consistentId=0:0:0:0:0:0:0:1,127.0.0.1,192.168.178.30,192.168.56.1:47501, addrs=ArrayList [0:0:0:0:0:0:0:1, 127.0.0.1, 192.168.178.30, 192.168.56.1], sockAddrs=HashSet [host.docker.internal/192.168.178.30:47501, /0:0:0:0:0:0:0:1:47501, WOPR/192.168.56.1:47501, /127.0.0.1:47501], discPort=47501, order=2, intOrder=2, lastExchangeTime=1675349650060, loc=false, ver=2.14.0#20220929-sha1:951e8deb, isClient=false], connected=false, connectCnt=8, queueLimit=4096, 
reserveCnt=9, pairedConnections=false], closeSocket=true, outboundMessagesQueueSizeMetric=org.apache.ignite.internal.processors.metric.impl.LongAdderMetric@69a257d1, super=GridNioSessionImpl [locAddr=/127.0.0.1:63676, rmtAddr=/127.0.0.1:47101, createTime=1675349671637, closeTime=0, bytesSent=0, bytesRcvd=0, bytesSent0=0, bytesRcvd0=0, sndSchedTime=1675349671637, lastSndTime=1675349671637, lastRcvTime=1675349671637, readsPaused=false, filterChain=FilterChain[filters=[GridNioCodecFilter [parser=org.apache.ignite.internal.util.nio.GridDirectParser@544beb47, directMode=true], GridConnectionBytesVerifyFilter], accepted=false, markedForClose=false]], super=GridAbstractCommunicationClient [lastUsed=1675349671637, closed=false, connIdx=0]], duration=339ms] - [org.apache.ignite.logger.java.JavaLogger info:]
PPUZ
  • What does it say on the remote host? It says the remote host terminated it. – Stephen Darlington Feb 01 '23 at 17:07
  • Hi Stephen. Thanks for the reply. Yes, I had guessed that. I was wondering why or how to fix this. Processing finishes successfully but this exception is puzzling me and I need help to resolve it. – PPUZ Feb 01 '23 at 22:44
  • If the remote host terminated the connection, it might tell you why. We'd need to see the logs/stack trace from _both_ machines. – Stephen Darlington Feb 02 '23 at 10:39
  • I will try to reproduce the error and post the logs of both nodes. I'm now trying to raise the log level and limit the message queue, which wasn't limited before. With these changes the error seems to have disappeared, but perhaps that is only due to the lower load on the nodes. – PPUZ Feb 02 '23 at 14:31
  • I have just added the logs of both nodes for the same time window. – PPUZ Feb 02 '23 at 14:43
  • Could the problem be caused by this event: "Unacknowledged messages queue size overflow"? – PPUZ Feb 02 '23 at 14:57
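One way to reason about the "Unacknowledged messages queue size overflow" warning is a toy backlog model: if a sender emits messages faster than the receiver acknowledges them, the unacknowledged backlog eventually reaches the limit. The sketch below is not Ignite's implementation. The class `UnackedBacklogModel`, the tick-based timing, and the overflow rule (backlog >= limit) are all assumptions made for illustration; only the limit of 4096 is taken from the log.

```java
// Toy model only: illustrates how a lagging receiver can fill a bounded
// queue of unacknowledged messages. Not derived from Ignite's source code.
final class UnackedBacklogModel {

    // Returns the 1-based tick at which the backlog first reaches queueLimit,
    // or -1 if that does not happen within maxTicks.
    static int ticksUntilOverflow(int queueLimit, int sentPerTick, int ackedPerTick, int maxTicks) {
        long backlog = 0;
        for (int tick = 1; tick <= maxTicks; tick++) {
            backlog += sentPerTick;                           // sender emits messages
            if (backlog >= queueLimit) {                      // assumed overflow rule
                return tick;
            }
            backlog = Math.max(0, backlog - ackedPerTick);    // receiver acknowledges some
        }
        return -1;
    }

    public static void main(String[] args) {
        // Receiver lags behind: backlog grows by 40 per tick.
        System.out.println(ticksUntilOverflow(4096, 100, 60, 1000));  // prints 101
        // Receiver keeps up: the limit is never reached.
        System.out.println(ticksUntilOverflow(4096, 100, 100, 1000)); // prints -1
    }
}
```

Under this model, a heavy computational load that delays acknowledgements would be enough to trigger the overflow, which is consistent with the error not appearing in the sleep-only test. Whether that is what actually happens here would need to be confirmed against the node logs.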
