44

What is the best strategy to refactor a Singleton object to a cluster environment?

We use a Singleton to cache some custom information from the database. It's mostly read-only but gets refreshed when a particular event occurs.

Now our application needs to be deployed in a clustered environment. By definition, each JVM will have its own Singleton instance, so when a refresh event occurs on a single node and only that node's cache is refreshed, the caches can be out of sync between the JVMs.

What is the best way to keep the caches in sync?

Thanks.

Edit: The cache is mainly used to provide an autocomplete list to the UI (for performance reasons), and we use WebSphere. So any WebSphere-related tips are welcome.

Amro
lud0h

9 Answers

16

Replace your singleton cache with a distributed cache.

One such cache could be JBoss Infinispan but I'm sure that other distributed cache and grid technologies exist, including commercial ones which are probably more mature at this point.

For singleton objects in general, I'm not sure. I think I'd try to not have singletons in the first place.

Chris Vest
10

The simplest approaches are:

  1. Add an expiry timer to your singleton cache so that every so often the cache gets purged and subsequent calls fetch the updated data from the source (e.g. a database).

  2. Implement a notification mechanism for the cache using something like a JMS topic/tibRV. Get each cache instance to subscribe and react to any change messages broadcast on this topic.
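Option 1 can be sketched in a few lines of Java. This is only a minimal illustration under stated assumptions, not a production cache: `ExpiringCache`, its `loader`, and the injected `clock` are hypothetical names, and the 30-minute TTL in the demo is just an example value.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.LongSupplier;
import java.util.function.Supplier;

/** Holds one loaded value and reloads it once it is older than ttlMillis. */
final class ExpiringCache<T> {
    private final Supplier<T> loader;   // hypothetical: e.g. a DAO call that reads the database
    private final long ttlMillis;       // how long a loaded value is considered fresh
    private final LongSupplier clock;   // injected so expiry can be exercised without sleeping
    private T value;
    private long loadedAt;
    private boolean loaded;

    ExpiringCache(Supplier<T> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    /** Returns the cached value, reloading from the source if it is missing or expired. */
    synchronized T get() {
        long now = clock.getAsLong();
        if (!loaded || now - loadedAt >= ttlMillis) {
            value = loader.get();
            loadedAt = now;
            loaded = true;
        }
        return value;
    }

    /** Forces the next get() to reload, e.g. when the local node sees the refresh event. */
    synchronized void invalidate() {
        loaded = false;
    }
}

final class ExpiringCacheDemo {
    public static void main(String[] args) {
        // Assumed values: a fixed list stands in for the database query, TTL is 30 minutes.
        ExpiringCache<List<String>> cache = new ExpiringCache<>(
                () -> Arrays.asList("alpha", "beta"),
                30 * 60 * 1000L,
                System::currentTimeMillis);
        System.out.println(cache.get());
    }
}
```

With this approach each JVM converges on the database state within one TTL period without any communication between nodes; the trade-off is that nodes can serve different data until their timers expire.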

pjp
  • Can you elaborate on 2? You mean JMS pub/subscribe model? – lud0h Jul 28 '09 at 14:11
  • Yes solution 2 is essentially a way of using a pub/sub mechanism for broadcasting changes to the individual cache instances. You'd need to create a JMS topic running on the application server that is subscribed to by each of the caches. When that data changes a message would need to be published to the topic. Each subscriber would then receive this message and update the local caches accordingly. – pjp Jul 28 '09 at 14:28
  • If your data doesn't change very often then I'd go for option 1. I've worked on several systems using this approach for refreshing reference data. I believe that we used to refresh the caches around every 30 minutes. The refresh period you choose will obviously depend on how your reference data is being used. – pjp Jul 28 '09 at 14:32
  • 1
    2 sounds interesting, but how do you handle the case of having potentially different singleton states for some period of time between instances? i.e. It takes time to inform the subscribed singleton instances, during which time they could potentially have different state information (think cache). – Shane Feb 25 '10 at 20:54
8

You could use the DistributedMap that is built into WAS.

-Rick

Rick
4

Or something like memcached

http://www.danga.com/memcached/

What is memcached? memcached is a high-performance, distributed memory object caching system, generic in nature, but intended for use in speeding up dynamic web applications by alleviating database load.

Danga Interactive developed memcached to enhance the speed of LiveJournal.com, a site which was already doing 20 million+ dynamic page views per day for 1 million users with a bunch of webservers and a bunch of database servers. memcached dropped the database load to almost nothing, yielding faster page load times for users, better resource utilization, and faster access to the databases on a memcache miss.

monkey_p
1

If possible, use your app server's support for this (some have it, some don't). For example, we use JBoss's support for an "HA Singleton", which is a service that only runs on the cluster master node. It's not perfect (you have to handle the case where it occasionally brain farts), but it's good enough.

Failing that, you may be able to engineer something using JGroups, which provides cluster node auto-discovery and negotiation, but it's non-trivial.

As a last resort, you can use database locking to manage cluster singletons, but that's seriously fragile. Not recommended.

As an alternative to a cluster singleton, you could use a distributed cache instead. I recommend JBossCache (which doesn't need JBoss app server to run) or EhCache (which now provides a distribution mechanism). You'll have to reengineer your cache to work in a distributed way (it won't magically just work), but it's probably going to be a better solution than a cluster singleton.

skaffman
1

I'm with Mr. Vest Hansen on this one: move as far away from singletons as you possibly can. After being plagued with the nightmare that is SAAJ and JAXP and getting compatible versions working on JBoss, I'm done with singletons and factories. A SOAP message shouldn't need a factory to instantiate it.

Okay, rant over, what about memcache or something similar? What sort of affinity do you need for your cache? Is it bad if it's EVER out of date, or is there some flexibility in how out of date the data can get?

Chris K
  • We use it for an autocomplete list, so the users will not see the changes. Thx for your feedback. – lud0h Jul 28 '09 at 13:59
1

There are several ways to handle this, depending on 1) how out of date the data is allowed to be, and 2) whether every instance needs to have the same values all of the time.

If you just need data that is reasonably up to date, but every JVM doesn't need to have matching data, you can just have every JVM refresh its data on the same schedule (e.g., every 30 seconds).

If the refresh needs to happen at about the same time, you can have one JVM send out a message to the rest of them saying "it's time to refresh now".

If every JVM always needs the same information, you need to do a sync, where the master says "refresh now", all of the caches block any new queries, refresh, and tell the master that they are done. When the master gets an answer back from every member of the cluster, it sends another message that says to proceed.
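The simple "refresh now" broadcast (not the blocking handshake) might look roughly like the sketch below. It is written under loud assumptions: `RefreshTopic` is a hypothetical in-process stand-in for whatever pub/sub transport the cluster actually offers (for example a JMS topic), and `NodeCache` is a hypothetical per-JVM cache. In a real deployment the publish call would cross the network and each JVM would hold exactly one `NodeCache`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Supplier;

/** Hypothetical in-process stand-in for a pub/sub topic shared by all nodes. */
final class RefreshTopic {
    private final List<Runnable> subscribers = new CopyOnWriteArrayList<>();

    void subscribe(Runnable onRefresh) {
        subscribers.add(onRefresh);
    }

    /** Delivers the "refresh now" message to every subscribed cache. */
    void publishRefresh() {
        for (Runnable subscriber : subscribers) {
            subscriber.run();
        }
    }
}

/** One per JVM: a local read-mostly cache that reloads when a refresh message arrives. */
final class NodeCache {
    private final Supplier<List<String>> loader; // hypothetical: reads the list from the database
    private volatile List<String> entries;       // replaced wholesale, so readers never block

    NodeCache(Supplier<List<String>> loader, RefreshTopic topic) {
        this.loader = loader;
        this.entries = loader.get();
        topic.subscribe(this::reload);
    }

    private void reload() {
        entries = loader.get();
    }

    List<String> entries() {
        return entries;
    }
}

final class RefreshDemo {
    public static void main(String[] args) {
        // A plain list stands in for the database table behind the autocomplete data.
        List<String> database = new ArrayList<>();
        database.add("apple");

        RefreshTopic topic = new RefreshTopic();
        NodeCache nodeA = new NodeCache(() -> new ArrayList<>(database), topic);
        NodeCache nodeB = new NodeCache(() -> new ArrayList<>(database), topic);

        database.add("apricot");   // the event that changes the source data
        topic.publishRefresh();    // the node that saw the event tells everyone to reload
        System.out.println(nodeA.entries() + " " + nodeB.entries());
    }
}
```

Between the change and the delivery of the message the nodes can still disagree, which is the window the blocking sync described above is meant to close.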

KeithB
  • Every instance needs some data, otherwise the users will not see new changes. Can you elaborate little more on keeping JVM's in sync. What kind of sub/notify available? Thx. – lud0h Jul 28 '09 at 14:02
  • 1
    ^Every instance needs some data^ -> Every instance needs *same* data – lud0h Jul 28 '09 at 14:04
  • 1, Can you please explain how you are telling to other JVM to refresh. Are you using pub/sub approach? 2, What if need updated data without any delay. i.e. If one thread in JVM1 is updating the data and that next minute JVM2 needs that data. how to handle this situation. – Dhrumil Shah Nov 16 '16 at 05:19
1

I'm facing a similar situation, but I'm using Oracle's WebLogic and Coherence.

I'm working on a web application that uses a HashMap with cached data read from the database (text to show on web form labels). To accomplish this, the developers used a singleton instance where they stored all this information. This worked well in a single-server environment, but now we want to move to a clustered solution and I'm facing this issue with the singleton instance.

From what I've read so far, this is the best solution to accomplish what I want. I hope this helps you with your problem, too.

XpiritO
0

There are products that provide a distributed in-memory cache (such as memcache) that can help in this situation.

A better solution, if possible, may be to have the singletons not really be single: have the application tolerate separate instances (say, each one recognizes when it needs to be refreshed) without requiring them to be in sync across JVMs, since that requirement can turn your cache into a bottleneck.

Yishai
  • Yeah, the tricky part is "that all recognize when they need to be refreshed"... JMS needs a messaging provider, so it looks like RMI may be the only option. Any other ideas (other than JGroups/Terracotta and so on), i.e. without external dependencies? – lud0h Aug 01 '09 at 16:42