2

I am working on an Java application which uses MySQL database as the data storage layer. There are few configuration tables in database, but each table has many thousands of records / rows. These all configuration is cached / loaded in memory in corresponding data structures / beans(JAVA POJO's) when application starts up.

Everything is fine except that every time the application starts the caching takes place and this usually takes 15-20 minutes, as the data to be cached is huge and also some columns have XML string which is parsed and then stored in beans.

So what's the big deal??

  • Why should we cache when no data is changed between consecutive start-up's.?? I can have all the beans encapsulated in a common Config bean and serialize it. And load this serialized object the next time when I figure out no data is changed - and yes of course loading serialized object is far faster then database hit plus bean population.

So is there any way I can figure this out? Of course at database level. I would query when the application starts - Was there any change in the database tables since it was last started. If yes do the same old boring caching process and store some unique identifier and serialize, Or if last identifier and current identifier are same just load the serialized object. This unique identifier will of course be persistent.

Xavier DSouza
  • 2,861
  • 7
  • 29
  • 40

2 Answers2

4

Add an last_updated column of type timestamp to the table.

When you need to check if there are changes on the table simply execute the query:

select max(last_updated) from YOUR_TABLE

If the last_updated is after the time you created the last cache copy you can update the cache with only the elements changed since last creation of the cache with a query similar to this one:

select * from YOUR_TABLE where last_updated > LAST_CACHE_UPDATE

As explained in the comments is higly recomandable to add an index on the column last_updated. Using an index give you the possibility to retrieve the maximum value in a table of 1.000.000.000 records in 30 steps (not 1.000.000.000 as wrong mentioned in the comments).


Davide Lorenzo MARINO
  • 26,420
  • 4
  • 39
  • 56
0

If you restart your application a lot and your cache can live in off memory data structure like redis or hazelcast, use that as cache, not the jvm memory. When update data, update both sides.

halil
  • 1,789
  • 15
  • 18
  • It works only if it is possible to update the records passing via redis. If the update is done directly via a sql client it is not taken in consideration – Davide Lorenzo MARINO Feb 12 '16 at 12:50
  • It's the purpose of caching, when one wants to update a record, update both datastore and cache ! – Louis F. Feb 12 '16 at 13:07
  • Only if it possible to access directly the datastore, otherwise it is possible find the best way to see if the data present in the database are changed or not. – Davide Lorenzo MARINO Feb 12 '16 at 13:16