
We are using a continuous query to transfer data to all client nodes. However, our grid scales up and down, so we often run into an issue where a data node keeps trying to connect to a client that has already been scaled down in order to send it continuous query updates. This brings the system to a halt because the PME operation cannot acquire a lock, so the topology doesn't get updated.

To resolve this, I want to use the TxTimeoutOnPartitionMapExchange parameter, which should allow PME to proceed. However, to make use of this parameter, do I need to change the atomicityMode of my caches to TRANSACTIONAL? If so, will a data node trying to send data from a continuous query count as a transaction?

In summary, I am trying to work out whether the TxTimeoutOnPartitionMapExchange parameter will help in my situation with continuous queries, and what the steps are to enable it.

EDIT: Stack trace of the issue I am trying to solve:

  • The continuous query keeps trying to reserve the client, and I believe it holds a global lock here which blocks cache updates and checkpointing:

Deadlock: false
    Completed: 1999706
Thread [name="sys-stripe-6-#7%pv-ib-valuation%", id=42, state=WAITING, blockCnt=52537, waitCnt=734400]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304)
        at o.a.i.i.util.future.GridFutureAdapter.get0(GridFutureAdapter.java:178)
        at o.a.i.i.util.future.GridFutureAdapter.get(GridFutureAdapter.java:141)
        at o.a.i.spi.communication.tcp.TcpCommunicationSpi.reserveClient(TcpCommunicationSpi.java:3229)
        at o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage0(TcpCommunicationSpi.java:3013)
        at o.a.i.spi.communication.tcp.TcpCommunicationSpi.sendMessage(TcpCommunicationSpi.java:2960)
        at o.a.i.i.managers.communication.GridIoManager.send(GridIoManager.java:2100)
        at o.a.i.i.managers.communication.GridIoManager.sendOrderedMessage(GridIoManager.java:2365)
        at o.a.i.i.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1964)
        at o.a.i.i.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1935)
        at o.a.i.i.processors.continuous.GridContinuousProcessor.sendWithRetries(GridContinuousProcessor.java:1917)
        at o.a.i.i.processors.continuous.GridContinuousProcessor.sendNotification(GridContinuousProcessor.java:1324)
        at o.a.i.i.processors.continuous.GridContinuousProcessor.addNotification(GridContinuousProcessor.java:1261)
        at o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryHandler.onEntryUpdate(CacheContinuousQueryHandler.java:1059)
        at o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryHandler.access$600(CacheContinuousQueryHandler.java:90)
        at o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryHandler$2.onEntryUpdated(CacheContinuousQueryHandler.java:459)
        at o.a.i.i.processors.cache.query.continuous.CacheContinuousQueryManager.onEntryUpdated(CacheContinuousQueryManager.java:447)
        at o.a.i.i.processors.cache.GridCacheMapEntry.innerUpdate(GridCacheMapEntry.java:2495)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.updateSingle(GridDhtAtomicCache.java:2657)
        at o.a.i.i.processors.cache.distributed.dht.atomic.GridDhtAtomicCache.update(GridDhtAtomicCache.java:2118)
  • This starts coming up after the reserveClient call is stuck because it is unable to acquire a lock:

>>> Possible starvation in striped pool.
    Thread name: sys-stripe-4-#5%pv-ib-valuation%
    Queue: []
    Deadlock: false
    Completed: 6328076
Thread [name="sys-stripe-4-#5%pv-ib-valuation%", id=40, state=WAITING, blockCnt=111790, waitCnt=2018248]
    Lock [object=java.util.concurrent.locks.ReentrantReadWriteLock$NonfairSync@66d8e343, ownerName=null, ownerId=-1]
        at sun.misc.Unsafe.park(Native Method)
        at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.parkAndCheckInterrupt(AbstractQueuedSynchronizer.java:836)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.doAcquireShared(AbstractQueuedSynchronizer.java:967)
        at java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireShared(AbstractQueuedSynchronizer.java:1283)
        at java.util.concurrent.locks.ReentrantReadWriteLock$ReadLock.lock(ReentrantReadWriteLock.java:727)
        at o.a.i.i.processors.cache.persistence.GridCacheDatabaseSharedManager.checkpointReadLock(GridCacheDatabaseSharedManager.java:1663)
        at o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpiredInternal(GridCacheOffheapManager.java:2715)
        at o.a.i.i.processors.cache.persistence.GridCacheOffheapManager$GridCacheDataStore.purgeExpired(GridCacheOffheapManager.java:2679)
        at o.a.i.i.processors.cache.persistence.GridCacheOffheapManager.expire(GridCacheOffheapManager.java:1051)
        at o.a.i.i.processors.cache.GridCacheTtlManager.expire(GridCacheTtlManager.java:243)
        at o.a.i.i.processors.cache.GridCacheUtils.unwindEvicts(GridCacheUtils.java:873)
        at o.a.i.i.processors.cache.GridCacheIoManager.onMessageProcessed(GridCacheIoManager.java:1189)

So overall, my analysis so far is that if a client is gone, the continuous query keeps trying to connect to it while holding a lock, which blocks everything else.

  • Sample page locks dump. It is a similar page locks dump every time, and all threads just seem to be waiting rather than holding locks:

Page locks dump:

Thread=[name=checkpoint-runner-#94%pv-ib-valuation%, id=162], state=WAITING
Locked pages = []
Locked pages log: name=checkpoint-runner-#94%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=checkpoint-runner-#95%pv-ib-valuation%, id=163], state=WAITING
Locked pages = []
Locked pages log: name=checkpoint-runner-#95%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=checkpoint-runner-#96%pv-ib-valuation%, id=164], state=WAITING
Locked pages = []
Locked pages log: name=checkpoint-runner-#96%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=checkpoint-runner-#97%pv-ib-valuation%, id=165], state=WAITING
Locked pages = []
Locked pages log: name=checkpoint-runner-#97%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-0-#15%pv-ib-valuation%, id=50], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-0-#15%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-1-#16%pv-ib-valuation%, id=51], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-1-#16%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-10-#25%pv-ib-valuation%, id=60], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-10-#25%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-11-#26%pv-ib-valuation%, id=61], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-11-#26%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-12-#27%pv-ib-valuation%, id=62], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-12-#27%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-13-#28%pv-ib-valuation%, id=63], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-13-#28%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-14-#29%pv-ib-valuation%, id=64], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-14-#29%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-15-#30%pv-ib-valuation%, id=65], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-15-#30%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-2-#17%pv-ib-valuation%, id=52], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-2-#17%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-3-#18%pv-ib-valuation%, id=53], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-3-#18%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-4-#19%pv-ib-valuation%, id=54], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-4-#19%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-5-#20%pv-ib-valuation%, id=55], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-5-#20%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-6-#21%pv-ib-valuation%, id=56], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-6-#21%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-7-#22%pv-ib-valuation%, id=57], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-7-#22%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-8-#23%pv-ib-valuation%, id=58], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-8-#23%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=data-streamer-stripe-9-#24%pv-ib-valuation%, id=59], state=WAITING
Locked pages = []
Locked pages log: name=data-streamer-stripe-9-#24%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=db-checkpoint-thread-#93%pv-ib-valuation%, id=161], state=TIMED_WAITING
Locked pages = []
Locked pages log: name=db-checkpoint-thread-#93%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=dms-writer-thread-#77%pv-ib-valuation%, id=145], state=WAITING
Locked pages = []
Locked pages log: name=dms-writer-thread-#77%pv-ib-valuation% time=(1674196038673, 2023-01-20 06:27:18.673)


Thread=[name=exchange-worker-#71%pv-ib-valuation%, id=139], state=TIMED_WAITING
Locked pages = []
Locked pages log: name=exchange-worker-#71%pv-ib-valuation% time=(1674196038673, 2023-01-20 06:27:18.673)


Thread=[name=lock-cleanup-0, id=278], state=WAITING
Locked pages = []
Locked pages log: name=lock-cleanup-0 time=(1674196038673, 2023-01-20 06:27:18.673)


Thread=[name=lock-cleanup-scheduled-0, id=171], state=WAITING
Locked pages = []
Locked pages log: name=lock-cleanup-scheduled-0 time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=main, id=1], state=WAITING
Locked pages = []
Locked pages log: name=main time=(1674196038673, 2023-01-20 06:27:18.673)


Thread=[name=query-#5729%pv-ib-valuation%, id=6455], state=WAITING
Locked pages = []
Locked pages log: name=query-#5729%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=query-#5730%pv-ib-valuation%, id=6456], state=WAITING
Locked pages = []
Locked pages log: name=query-#5730%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=query-#5735%pv-ib-valuation%, id=6461], state=WAITING
Locked pages = []
Locked pages log: name=query-#5735%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=query-#5736%pv-ib-valuation%, id=6462], state=WAITING
Locked pages = []
Locked pages log: name=query-#5736%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-0-#1%pv-ib-valuation%, id=36], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-0-#1%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-1-#2%pv-ib-valuation%, id=37], state=RUNNABLE
Locked pages = []
Locked pages log: name=sys-stripe-1-#2%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-10-#11%pv-ib-valuation%, id=46], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-10-#11%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-11-#12%pv-ib-valuation%, id=47], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-11-#12%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-12-#13%pv-ib-valuation%, id=48], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-12-#13%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-13-#14%pv-ib-valuation%, id=49], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-13-#14%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-2-#3%pv-ib-valuation%, id=38], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-2-#3%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-3-#4%pv-ib-valuation%, id=39], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-3-#4%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-4-#5%pv-ib-valuation%, id=40], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-4-#5%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-5-#6%pv-ib-valuation%, id=41], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-5-#6%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-6-#7%pv-ib-valuation%, id=42], state=RUNNABLE
Locked pages = []
Locked pages log: name=sys-stripe-6-#7%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-7-#8%pv-ib-valuation%, id=43], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-7-#8%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-8-#9%pv-ib-valuation%, id=44], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-8-#9%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=sys-stripe-9-#10%pv-ib-valuation%, id=45], state=WAITING
Locked pages = []
Locked pages log: name=sys-stripe-9-#10%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)


Thread=[name=ttl-cleanup-worker-#62%pv-ib-valuation%, id=127], state=TIMED_WAITING
Locked pages = []
Locked pages log: name=ttl-cleanup-worker-#62%pv-ib-valuation% time=(1674196038674, 2023-01-20 06:27:18.674)
– Lokesh

1 Answer


TxTimeoutOnPartitionMapExchange is about rolling back active transactions to unblock a PME process. It won't magically unblock every PME that might be stuck for some other reason.

For sure, it's worth having this setting configured in any case. To enable it, adjust your server nodes' configuration and set this property to some value, like 30 seconds. Here is an example of the XML changes:
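
(A minimal sketch only: the 30-second value is illustrative, and the rest of the IgniteConfiguration bean is omitted.)

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- ... your existing configuration ... -->

        <!-- Roll back transactions that are holding up PME after 30 seconds. -->
        <property name="transactionConfiguration">
            <bean class="org.apache.ignite.configuration.TransactionConfiguration">
                <property name="txTimeoutOnPartitionMapExchange" value="30000"/>
            </bean>
        </property>
    </bean>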

Speaking of the original CQ issue with client disconnects, I'd expect Ignite to handle that automatically with no problems. In other words, I don't think the issue of a hung PME is caused by a continuous query itself, but rather by something else, like yes, active TXs without a timeout.

You don't need to change the atomicityMode of your caches. Transactions can't be applied to a non-transactional (ATOMIC) cache anyway.

– Alexandr Shapkin
  • Thanks @Alexandr. When you say "You don't need to change atomicityMode of your caches", does that mean the TxTimeoutOnPartitionMapExchange parameter can help even with an atomic cache, which will not have any transactions? Also, in our case Ignite (version 2.9.1) is not able to handle client disconnects; it starts giving thread pool starvation warnings for the striped pool, leading to blocked critical system errors. In the starvation warnings, it is continuously trying to reserve the client to send the continuous query update. I have added the stack trace to the question. Any pointers will be helpful. – Lokesh Jan 22 '23 at 04:29
  • Yes, it still could help, because Ignite can start transactions implicitly. There is a special ignite-sys-cache to keep auxiliary information, and yes, it's transactional. Overall, it's worth having the mentioned timeout configured anyway. – Alexandr Shapkin Jan 23 '23 at 13:12
  • There might be a known issue related to version 2.9.1, though. But to me this looks like a network-related issue. Does the situation resolve eventually if you wait for some time, like 20 minutes? How many public network interfaces do your nodes have? I suggest binding communication and discovery explicitly to the external IP/domain only using the localAddress property: https://ignite.apache.org/docs/latest/clustering/network-configuration#communication. Do you use inverse connection? https://ignite.apache.org/docs/latest/clustering/running-client-nodes-behind-nat – Alexandr Shapkin Jan 23 '23 at 13:31
  • Check that clientFailureDetectionTimeout and failureDetectionTimeout are in sync. They are different for some reason, and they are two different settings applicable only to server nodes (see the configuration sketch after this comment thread). My point is: we want a failed client to be considered failed as fast as possible, so we are not blocked trying to establish communication. – Alexandr Shapkin Jan 23 '23 at 13:33
  • clientFailureDetectionTimeout and failureDetectionTimeout are indeed different, but the issue I am facing is on server nodes only, when they fail to find client nodes. Would you recommend keeping them the same (i.e. in sync)? – Lokesh Jan 24 '23 at 07:58
  • Whenever a node leaves, I can see two types of behaviour. The node doesn't get stuck when I see "Node left topology: TcpDiscoveryNode" in the log; it keeps working with no retry logs mentioning "createNioConnection". But when the grid gets stuck, it shows lots of "createNioConnection" logs. My understanding is that when we see the "Node left" log, it means PME was processed successfully by the node. My question is: if there are pending messages for the previous topology, will PME wait for them or will it get processed anyway? In the case above I can see PME went through despite pending messages for missing nodes. – Lokesh Jan 24 '23 at 08:43
  • The timeouts should be the same. With a larger timeout on the server, it might think a client is still there when in fact it has already failed on the client's side. – Alexandr Shapkin Jan 24 '23 at 11:10
  • Speaking of NodeLeft: yes. If a server sees that message, it won't try to reconnect, and you won't have that "reserveClient" stack trace. Your goal here is to kick the failed client out of the server's view as soon as possible, so it won't try to re-establish a connection. That's why I'm suggesting you tune the timeouts and the communication/discovery handling of that client if it has multiple network interfaces (set localAddress), plus you might want to decrease failureDetectionTimeout/networkTimeout as well. – Alexandr Shapkin Jan 24 '23 at 11:13
  • We are now working on reducing the values of clientFailureDetectionTimeout and failureDetectionTimeout; however, we may not be able to set both to the same value immediately, as we want to reduce them in steps and watch the impact, given the possibility of JVM pauses. Will there be any adverse effect on the system if these two parameters have a gap of 5 seconds? – Lokesh Jan 29 '23 at 13:55
  • I don't think 5 seconds matters a lot. Ideally match them; a 5-second gap might be fine. It's better to check in your environment and decide on the best values. Again, the actual problem might be something else entirely. Remember, I pointed you to check the number of network interfaces on your nodes. – Alexandr Shapkin Jan 29 '23 at 22:47
  • Just curious, is it a development project or a production-live one? – Alexandr Shapkin Jan 29 '23 at 22:49
  • I have checked the nodes and there is one network interface card per node. However, I believe the chances of network traffic being slow are low, as the grid works perfectly fine when we disable scaling, even during peak load. – Lokesh Jan 30 '23 at 05:25
  • This is a live production system that has been running fine for a while, but due to a heavy increase in load at certain times we are seeing instability, which we are looking to fix. – Lokesh Jan 30 '23 at 05:26
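
Configuration sketch referenced in the comments above: a minimal server-side example combining the suggestions (the 10-second value and the 10.0.0.5 address are placeholders only, not recommendations, and the localAddress part applies only when a host has more than one network interface):

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- Keep client and server failure detection in sync so a
             scaled-down client is dropped from the topology quickly. -->
        <property name="clientFailureDetectionTimeout" value="10000"/>
        <property name="failureDetectionTimeout" value="10000"/>

        <!-- If a host has several network interfaces, pin communication
             and discovery to the externally reachable address. -->
        <property name="communicationSpi">
            <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
                <property name="localAddress" value="10.0.0.5"/>
            </bean>
        </property>
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <property name="localAddress" value="10.0.0.5"/>
            </bean>
        </property>
    </bean>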