
We have been experimenting with the number of Ignite server pods to see the impact on performance.

One thing we have noticed is that if the number of Ignite server pods is increased after client nodes have established communication, the new pod just crash-loops with the error below.

If, however, the grid is destroyed (all client and server nodes brought down) and the desired number of server nodes is then launched, there are no issues.

Also, the above procedure is only fully dependable when launching a single Ignite server.

From reading [this Stack Overflow post][1] and [this documentation][2], it looks like the issue may be that we are not launching the required "Kubernetes service":

Ignite's KubernetesIPFinder requires users to configure and deploy a special Kubernetes service that maintains a list of the IP addresses of all the alive Ignite pods (nodes).

However, this is the only documentation I have found, and it says that it is no longer current.

Is this information still relevant for Ignite 2.11.1? If not, is there more recent documentation? If this service is indeed needed, are there more concrete examples and information on setting it up?

Error on new Server pod:

[21:37:55,793][SEVERE][main][IgniteKernal] Failed to start manager: GridManagerAdapter [enabled=true, name=o.a.i.i.managers.discovery.GridDiscoveryManager]
class org.apache.ignite.IgniteCheckedException: Failed to start SPI: TcpDiscoverySpi [addrRslvr=null, addressFilter=null, sockTimeout=5000, ackTimeout=5000, marsh=JdkMarshaller [clsFilter=org.apache.ignite.marshaller.MarshallerUtils$1@78422efb], reconCnt=10, reconDelay=2000, maxAckTimeout=600000, soLinger=0, forceSrvMode=false, clientReconnectDisabled=false, internalLsnr=null, skipAddrsRandomization=false]
    at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:281)
    at org.apache.ignite.internal.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:980)
    at org.apache.ignite.internal.IgniteKernal.startManager(IgniteKernal.java:1985)
    at org.apache.ignite.internal.IgniteKernal.start(IgniteKernal.java:1331)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2141)
    at org.apache.ignite.internal.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1787)
    at org.apache.ignite.internal.IgnitionEx.start0(IgnitionEx.java:1172)
    at org.apache.ignite.internal.IgnitionEx.startConfigurations(IgnitionEx.java:1066)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:952)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:851)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:721)
    at org.apache.ignite.internal.IgnitionEx.start(IgnitionEx.java:690)
    at org.apache.ignite.Ignition.start(Ignition.java:353)
    at org.apache.ignite.startup.cmdline.CommandLineStartup.main(CommandLineStartup.java:367)
Caused by: class org.apache.ignite.spi.IgniteSpiException: Node with the same ID was found in node IDs history or existing node in topology has the same ID (fix configuration and restart local node) [localNode=TcpDiscoveryNode [id=000e84bb-f587-43a2-a662-c7c6147d2dde, consistentId=8751ef49-db25-4cf9-a38c-26e23a96a3e4, addrs=ArrayList [0:0:0:0:0:0:0:1%lo, 127.0.0.1, fd00:85:4001:5:f831:8cc:cd3:f863%eth0], sockAddrs=HashSet [nkw-mnomni-ignite-1-1-1.nkw-mnomni-ignite-1-1.680e5bbc-21b1-5d61-8dfa-6b27be10ede7.svc.cluster.local/fd00:85:4001:5:f831:8cc:cd3:f863:47500, /0:0:0:0:0:0:0:1%lo:47500, /127.0.0.1:47500], discPort=47500, order=0, intOrder=0, lastExchangeTime=1676497065109, loc=true, ver=2.11.1#20211220-sha1:eae1147d, isClient=false], existingNode=000e84bb-f587-43a2-a662-c7c6147d2dde]
    at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.duplicateIdError(TcpDiscoverySpi.java:2083)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl.joinTopology(ServerImpl.java:1201)
    at org.apache.ignite.spi.discovery.tcp.ServerImpl.spiStart(ServerImpl.java:473)
    at org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2207)
    at org.apache.ignite.internal.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:278)
    ... 13 more

Server DiscoverySpi Config:

<property name="discoverySpi"> 
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi"> 
                <property name="ipFinder"> 
                    <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder"> 
                        <property name="namespace" value="myNameSpace"/> 
                        <property name="serviceName" value="myServiceName"/> 
                    </bean> 
                </property> 
            </bean> 
        </property> 

Client DiscoverySpi Configs:

<bean id="discoverySpi" class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
        <property name="ipFinder" ref="ipFinder" />
    </bean>

    <bean id="ipFinder" class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
        <property name="shared" value="false" />
        <property name="addresses">
            <list>
                <value>myServiceName.myNameSpace:47500</value>
            </list>
        </property>
    </bean>

Edit:

I have experimented more with this issue. As long as I do not deploy any clients (using the static TcpDiscoveryVmIpFinder above), I am able to scale the server pods up and down without any issues. However, as soon as a single client joins, I am no longer able to scale the server pods up.

I can see that the server pods have ports 47500 and 47100 open, so I am not sure what the issue is. Does the TcpDiscoveryKubernetesIpFinder still need the port to be specified on the client config?
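
In case it matters, here is a sketch of what explicitly pinning the ports on the SPIs would look like; the values are assumptions (47500 and 47100 are Ignite's default discovery and communication ports), and this is not a config we actually run:

    <!-- Sketch only: pinning the default discovery and communication ports explicitly. -->
    <property name="discoverySpi">
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
            <!-- 47500 is Ignite's default discovery port. -->
            <property name="localPort" value="47500"/>
            <property name="ipFinder">
                <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder">
                    <property name="namespace" value="myNameSpace"/>
                    <property name="serviceName" value="myServiceName"/>
                </bean>
            </property>
        </bean>
    </property>
    <property name="communicationSpi">
        <bean class="org.apache.ignite.spi.communication.tcp.TcpCommunicationSpi">
            <!-- 47100 is Ignite's default communication port. -->
            <property name="localPort" value="47100"/>
        </bean>
    </property>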

I have tried changing my client config to use the TcpDiscoveryKubernetesIpFinder as shown below, but I am getting a discovery timeout failure (see the thread dump below).

    <property name="discoverySpi"> 
        <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi"> 
            <property name="ipFinder"> 
                <bean class="org.apache.ignite.spi.discovery.tcp.ipfinder.kubernetes.TcpDiscoveryKubernetesIpFinder"> 
                    <property name="namespace" value="680e5bbc-21b1-5d61-8dfa-6b27be10ede7"/> 
                    <property name="serviceName" value="nkw-mnomni-ignite-1-1"/> 
                </bean> 
            </property> 
        </bean> 
    </property> 
24-Feb-2023 14:15:02.450 WARNING [grid-timeout-worker-#22%igniteClientInstance%] org.apache.ignite.logger.java.JavaLogger.warning Thread dump at 2023/02/24 14:15:02 UTC
Thread [name="main", id=1, state=WAITING, blockCnt=78, waitCnt=3]
    Lock [object=java.util.concurrent.CountDownLatch$Sync@45296dbd, ownerName=null, ownerId=-1]
        at java.base@17.0.1/jdk.internal.misc.Unsafe.park(Native Method)
        at java.base@17.0.1/java.util.concurrent.locks.LockSupport.park(LockSupport.java:211)
        at java.base@17.0.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquire(AbstractQueuedSynchronizer.java:715)
        at java.base@17.0.1/java.util.concurrent.locks.AbstractQueuedSynchronizer.acquireSharedInterruptibly(AbstractQueuedSynchronizer.java:1047)
        at java.base@17.0.1/java.util.concurrent.CountDownLatch.await(CountDownLatch.java:230)
        at o.a.i.spi.discovery.tcp.ClientImpl.spiStart(ClientImpl.java:324)
        at o.a.i.spi.discovery.tcp.TcpDiscoverySpi.spiStart(TcpDiscoverySpi.java:2207)
        at o.a.i.i.managers.GridManagerAdapter.startSpi(GridManagerAdapter.java:278)
        at o.a.i.i.managers.discovery.GridDiscoveryManager.start(GridDiscoveryManager.java:980)
        at o.a.i.i.IgniteKernal.startManager(IgniteKernal.java:1985)
        at o.a.i.i.IgniteKernal.start(IgniteKernal.java:1331)
        at o.a.i.i.IgnitionEx$IgniteNamedInstance.start0(IgnitionEx.java:2141)
        at o.a.i.i.IgnitionEx$IgniteNamedInstance.start(IgnitionEx.java:1787)
        - locked o.a.i.i.IgnitionEx$IgniteNamedInstance@57ac9100
        at o.a.i.i.IgnitionEx.start0(IgnitionEx.java:1172)
        at o.a.i.i.IgnitionEx.startConfigurations(IgnitionEx.java:1066)
        at o.a.i.i.IgnitionEx.start(IgnitionEx.java:952)
        at o.a.i.i.IgnitionEx.start(IgnitionEx.java:851)
        at o.a.i.i.IgnitionEx.start(IgnitionEx.java:721)
        at o.a.i.i.IgnitionEx.start(IgnitionEx.java:690)
        at o.a.i.Ignition.start(Ignition.java:353)

Edit 2: I also spoke with an admin about opening client-side ports in case that was the issue. He indicated that this should not be needed, as clients should be able to open ephemeral ports to communicate with the server nodes.
[1]: Ignite not discoverable in kubernetes cluster with TcpDiscoveryKubernetesIpFinder
[2]: https://apacheignite.readme.io/docs/kubernetes-ip-finder

  • What cloud provider are you using? GKE/EKS/OpenShift? – Alexandr Shapkin Feb 16 '23 at 23:22
  • Most likely it's not about the KubernetesIpFinder but rather about a communication/network issue. – Alexandr Shapkin Feb 16 '23 at 23:30
  • Also, why do you use different IpFinders for clients and servers? Are clients deployed differently? They are using a predefined address with shared=false. What's the reason behind that? – Alexandr Shapkin Feb 16 '23 at 23:35
  • @AlexandrShapkin, we are hosting these ourselves; we are not using any provider. The reason for the two different finders is that when we tried to change both to TcpDiscoveryKubernetesIpFinder, the client was failing to establish communication. I am not sure why shared is set to false; I may try to play with that. –  RichardFeynman Feb 17 '23 at 00:19

2 Answers


It's hard to say precisely what the root cause is, but in general it's something related to the network or to domain name resolution.

A public address is assigned to a node on startup and is exposed to the other nodes for communication. The other nodes store that address together with the nodeId in their history. Here is what is happening: a new node tries to enter the cluster, it connects to a random node, and the request is then forwarded to the coordinator. The coordinator issues a TcpDiscoveryNodeAddedMessage that must travel around the topology ring and be acknowledged by all other nodes. That process did not finish within the join timeout, so the new node tries to re-enter the topology by starting the same join process, but with a new ID. However, the other nodes see that this address is already registered by another nodeId, causing the original duplicate nodeId error.

Some recommendations:

  • If the issue is reproducible on a regular basis, I'd recommend collecting more information by enabling DEBUG logging for the following package: org.apache.ignite.spi.discovery (discovery-related events tracing)

  • Take thread dumps from affected nodes (could be done by kill -3). Check for discovery-related issues. Search for "lookupAllHostAddr".

  • Check that it's not a DNS issue and that all public addresses for your node, e.g. nkw-mnomni-ignite-1-1-1.nkw-mnomni-ignite-1-1.680e5bbc-21b1-5d61-8dfa-6b27be10ede7.svc.cluster.local, are resolved instantly. I was asking about the provider because in OpenShift there seems to be a hard limit on DNS resolution time.

  • Check GC and safepoints.

  • To hide the underlying issue you can play around with the Ignite configuration: increase the network timeout and join timeout, or reduce the failure detection timeout (see the sketch below). But I recommend finding the real root cause instead of treating the symptoms.
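
A minimal sketch of where those knobs live in the XML configuration; the values below are placeholders, not recommendations:

    <bean class="org.apache.ignite.configuration.IgniteConfiguration">
        <!-- Timeout for network operations, in milliseconds (placeholder value). -->
        <property name="networkTimeout" value="10000"/>
        <!-- Failure detection timeout, in milliseconds (placeholder value). -->
        <property name="failureDetectionTimeout" value="10000"/>
        <property name="discoverySpi">
            <bean class="org.apache.ignite.spi.discovery.tcp.TcpDiscoverySpi">
                <!-- Join timeout, in milliseconds; 0 means wait indefinitely (placeholder value). -->
                <property name="joinTimeout" value="60000"/>
                <!-- ipFinder configuration omitted; same as in the question. -->
            </bean>
        </property>
    </bean>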

Alexandr Shapkin
  • I've spent some more time trying to debug this issue and it seems related to the use of the TcpDiscoverySpi; however, as I mentioned before, when we try to use the TcpDiscoveryKubernetesIpFinder, clients are unable to connect to the grid. –  RichardFeynman Feb 24 '23 at 18:43
  • Are clients and servers deployed in the same network? Maybe the clients are behind NAT? A different Kubernetes namespace? – Alexandr Shapkin Feb 25 '23 at 14:47
  • Server and client pods are on the same network AND namespace. There is no NAT that I am aware of. What is odd is that client pods can join the cluster when using the TcpDiscoveryVmIpFinder but not the TcpDiscoveryKubernetesIpFinder. The only difference between the two configs is that the TcpDiscoveryVmIpFinder has a port specified for discovery. I know that when running an Ignite server and client on the same machine I have had to specify the communication port. Could there be a similar issue here? –  RichardFeynman Feb 27 '23 at 17:22

Removing the property <property name="shared" value="false"/> from my client config solved the problem in the end. I am still not certain why the TcpDiscoveryKubernetesIpFinder does not work.
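
For reference, the working client ipFinder bean is now just the original bean from the question with the shared property removed:

    <bean id="ipFinder" class="org.apache.ignite.spi.discovery.tcp.ipfinder.vm.TcpDiscoveryVmIpFinder">
        <property name="addresses">
            <list>
                <value>myServiceName.myNameSpace:47500</value>
            </list>
        </property>
    </bean>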

RichardFeynman