
We have a large Java web application that uses GemFire and spring-data-gemfire. We run GemFire in a client/server configuration.

We have the following problem: during startup, in the bean wiring phase, spring-data-gemfire tries to connect to the GemFire locator. However, the locator may not be up yet. The application then throws `com.gemstone.gemfire.cache.NoSubscriptionServersAvailableException: Primary discovery failed`.

This causes a slow and fragile startup procedure of our services, which is inconvenient, especially during our automated tests.

Is there any good solution to have the client wait and periodically poll until the locator is running?

Jesse van Bekkum
    You may want to try setting the GemFire property `locator-wait-time`. I believe this was introduced in 8.1 and allows a server to wait the specified time for a locator to become available. – Jens D Mar 04 '16 at 15:30

1 Answer


As Jens D comments, you can try the locator-wait-time GemFire (System) property. However, as the documentation points out...

The number of seconds that a member should wait for a locator to start if a locator is not available when attempting to join the distributed system. Use this setting when you are starting locators and peers all at once. This timeout allows peers to wait for the locators to finish starting up before attempting to join the distributed system.

This specifically refers to a "peer member" joining the distributed system/cluster, and thus may not have any effect on a client (cache).

In which case, I have employed other techniques using Spring (specifically in integration tests involving a client/server topology) to cause the client to block waiting for the Server (or Locator) to become available. In my tests, the test forks a separate GemFire JVM process to run the Server while the test VM serves as the cache client.

You can see examples of this in my most recent development effort, integrating GemFire with Spring Session, specifically in the httpsession-gemfire-clientserver sample.

Here, I used a BeanPostProcessor that causes the client cache, and specifically the PoolFactoryBean/Pool, to block (in postProcessBeforeInitialization(..)) preventing the Pool from being fully initialized until the Server is available (could also apply to a Locator).

The wait just attempts to open a Socket connection to the Server (or Locator) to verify connectivity.
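A minimal sketch of such a socket-based wait, using only the JDK. This is not the code from the sample; the class name `ServerAvailability`, the method `waitForServer`, and the connect and retry intervals are all assumptions chosen for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of GemFire or Spring Data GemFire.
final class ServerAvailability {

    private ServerAvailability() {}

    /**
     * Repeatedly tries to open a TCP connection to host:port.
     * Returns true as soon as a connection succeeds, or false once
     * the timeout has elapsed without a successful connection.
     */
    static boolean waitForServer(String host, int port, long timeout, TimeUnit unit)
            throws InterruptedException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        do {
            try (Socket socket = new Socket()) {
                // Assumed per-attempt connect timeout of 500 ms.
                socket.connect(new InetSocketAddress(host, port), 500);
                return true;
            } catch (IOException notYetAvailable) {
                // Assumed pause of 250 ms between attempts.
                Thread.sleep(250);
            }
        } while (System.nanoTime() - deadline < 0);
        return false;
    }
}
```

A `BeanPostProcessor` could call something like `ServerAvailability.waitForServer("localhost", 10334, 60, TimeUnit.SECONDS)` from `postProcessBeforeInitialization(..)` when it sees the pool bean, and fail fast if the result is `false`. The host and port here are placeholders for your own Locator or Server endpoint.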

Another approach is to create a CountDownLatch, use it in a registered GemFire ClientMembershipListener and combine it with, again, the BeanPostProcessor, only in the postProcessAfterInitialization(..) method this time.
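The latch half of that idea can be sketched with the JDK alone. The class `ServerJoinLatch` and its method names are hypothetical. The assumption is that a registered `ClientMembershipListener` would call `serverJoined()` from its member-joined callback, and that the `BeanPostProcessor` would call `awaitServer(..)` in `postProcessAfterInitialization(..)`.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical helper; the GemFire listener wiring is assumed, not shown.
final class ServerJoinLatch {

    // Released once, when the first server is reported as joined.
    private final CountDownLatch latch = new CountDownLatch(1);

    /** Intended to be called from the membership listener's join callback. */
    void serverJoined() {
        latch.countDown();
    }

    /**
     * Blocks until serverJoined() has been called or the timeout elapses.
     * Returns true if a server joined in time, false otherwise.
     */
    boolean awaitServer(long timeout, TimeUnit unit) throws InterruptedException {
        return latch.await(timeout, unit);
    }
}
```

Because the latch stays open after the first `countDown()`, any later call to `awaitServer(..)` returns immediately, so the wait only costs time during startup.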

Technically, only one of the two approaches is necessary. While I used this for testing purposes, it can be used in an actual application as well, and doing so is not uncommon.

Ideally, however, you are starting your Locators before anything else, since forming a cluster depends on it.

Hope this helps.

Cheers! John

John Blum