I’m trying to figure out which cache concurrency strategy I should use for my application (for entity updates, in particular). The application is a web service built with Hibernate, deployed on an Amazon EC2 cluster, and runs on Tomcat, so there is no application server involved.
I know that there are nonstrict-read-write, read-write, and transactional cache concurrency strategies for data that can be updated, and that there are mature, popular, production-ready second-level cache providers for Hibernate: Infinispan, Ehcache, Hazelcast.
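For context, my updatable entities are currently mapped like this (`Person` is a stand-in for my real classes):

```java
import javax.persistence.Entity;
import javax.persistence.Id;

import org.hibernate.annotations.Cache;
import org.hibernate.annotations.CacheConcurrencyStrategy;

@Entity
@Cache(usage = CacheConcurrencyStrategy.READ_WRITE) // vs. NONSTRICT_READ_WRITE / TRANSACTIONAL
public class Person {
    @Id
    private Long id;

    private String firstName;

    public String getFirstName() { return firstName; }
    public void setFirstName(String firstName) { this.firstName = firstName; }
}
```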
But I don't completely understand the difference between the transactional and read-write strategies from the Hibernate documentation. I thought the transactional cache was the only choice for a clustered application, but now (after reading some topics) I'm not so sure about that.
So my question is about the read-write cache. Is it cluster-safe? Does it guarantee synchronization between the database and the cache, as well as synchronization across all the connected servers? Or is it only suitable for single-server applications, so that I should always prefer the transactional cache in a cluster?
For example, if a database transaction that updates an entity field (a first name, say) fails and is rolled back, will the read-write cache discard the change, or will it propagate the bad data (the updated first name) to all the other nodes? Does it require a JTA transaction to get this right?
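Here is the scenario sketched with plain (non-JTA) Hibernate transactions; `Person` and the session factory are placeholders for my real code:

```java
import org.hibernate.Session;
import org.hibernate.SessionFactory;
import org.hibernate.Transaction;

public class RenameSketch {
    static void renamePerson(SessionFactory sessionFactory, Long id, String newName) {
        Session session = sessionFactory.openSession();
        Transaction tx = session.beginTransaction();
        try {
            Person p = (Person) session.get(Person.class, id);
            p.setFirstName(newName);    // entity is now dirty; flushed at commit
            tx.commit();                // success: the cache entry should be refreshed
        } catch (RuntimeException e) {
            tx.rollback();              // failure: is the cached entry discarded on
                                        // this node only, or across the whole cluster?
            throw e;
        } finally {
            session.close();
        }
    }
}
```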
The topic *Concurrency strategy configuration for JBoss TreeCache as 2nd level Hibernate cache* says:
> `READ_WRITE` is an interesting combination. In this mode Hibernate itself acts as a lightweight XA coordinator, so it doesn't require a full-blown external XA setup. A short description of how it works:
>
> - In this mode Hibernate manages the transactions itself. All DB actions must be inside a transaction; autocommit mode won't work.
> - During flush() (which may occur multiple times during a transaction's lifetime, but usually happens just before the commit) Hibernate goes through the session and looks for updated/inserted/deleted objects. These objects are first saved to the database, and then locked and updated in the cache, so concurrent transactions can neither update nor read them.
> - If the transaction is then rolled back (explicitly or because of some error), the locked objects are simply released and evicted from the cache, so other transactions can read/update them.
> - If the transaction is committed successfully, the locked objects are simply released and other threads can read/write them.
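To check my understanding, here is the quoted sequence restated against Hibernate's cache SPI (`org.hibernate.cache.access.EntityRegionAccessStrategy` in 3.3–3.6); the ordering is my interpretation of the answer above, not code taken from Hibernate itself:

```java
import org.hibernate.cache.CacheException;
import org.hibernate.cache.access.EntityRegionAccessStrategy;
import org.hibernate.cache.access.SoftLock;

public class ReadWriteFlowSketch {
    // My reading of the read-write update path: soft-lock the cached entry,
    // write the row, then release the lock on commit or rollback.
    static void updateEntry(EntityRegionAccessStrategy strategy,
                            Object key, Object newState,
                            Object oldVersion, Object newVersion,
                            boolean committed) throws CacheException {
        // flush(): lock the entry so concurrent transactions can neither
        // read nor update it while the DB write is in flight
        SoftLock lock = strategy.lockItem(key, oldVersion);
        // ... the JDBC UPDATE happens here ...
        if (committed) {
            // commit: install the new state and release the lock
            strategy.afterUpdate(key, newState, newVersion, oldVersion, lock);
        } else {
            // rollback: just release the lock; the entry was evicted,
            // so readers fall through to the database
            strategy.unlockItem(key, lock);
        }
    }
}
```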
Is there any documentation on how this works in a clustered environment?
It seems that the transactional cache handles this correctly, but it requires a JTA environment with a standalone transaction manager (such as JBossTM, Atomikos, or Bitronix), an XA datasource, and a lot of configuration changes and testing. I managed to deploy all of this, but I still have some issues with my frameworks. For instance, Google Guice IoC does not support JTA transactions, so I would have to replace it with Spring, or move the service to an application server and use EJB.
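For reference, this is roughly how an update looks in the JTA setup I have working now with Bitronix (`Person` and the session factory are placeholders; Hibernate itself also needs the JTA transaction manager lookup and an XA datasource configured, which is where most of my configuration effort went):

```java
import javax.transaction.UserTransaction;

import org.hibernate.Session;
import org.hibernate.SessionFactory;

import bitronix.tm.TransactionManagerServices;

public class JtaUpdateSketch {
    static void renamePerson(SessionFactory sessionFactory, Long id, String newName)
            throws Exception {
        // Bitronix's transaction manager doubles as a UserTransaction
        UserTransaction utx = TransactionManagerServices.getTransactionManager();
        utx.begin();
        try {
            // getCurrentSession() joins the ongoing JTA transaction
            Session session = sessionFactory.getCurrentSession();
            Person p = (Person) session.get(Person.class, id);
            p.setFirstName(newName);
            utx.commit();   // DB and transactional cache commit together
        } catch (Exception e) {
            utx.rollback(); // both DB and cache changes are discarded
            throw e;
        }
    }
}
```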
So which way is better?
Thanks in advance!