14

We have a website that uses NHibernate with the second-level cache enabled. We are having a debate because one person wants to turn off the second-level cache as we move to a multi-webserver environment (with a load balancer in front).

One argument is to get rid of the second-level cache and focus on optimizing and tuning the DB. The other argument is to roll out a distributed cache as the second-level cache.

I am curious to hear folks' pros and cons of DB tuning versus a distributed cache (factoring in the effort involved, cost, complexity, etc.).

skaffman
leora

6 Answers

13

In a load-balancing scenario you have to use a distributed cache provider to get the best performance and consistency; that has nothing to do with optimizing your database. In any scenario you should optimize your database.
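
As a rough illustration, here is a minimal sketch of pointing NHibernate's second-level cache at a distributed provider through code-based configuration. The provider class and assembly names below assume the memcached provider from the NHibernate.Caches project; substitute whichever distributed provider (AppFabric/Velocity, etc.) you actually deploy.

    // Minimal sketch: point the second-level cache at a distributed provider
    // so every web server behind the load balancer shares the same cache.
    var cfg = new NHibernate.Cfg.Configuration().Configure(); // reads hibernate.cfg.xml

    cfg.SetProperty(NHibernate.Cfg.Environment.UseSecondLevelCache, "true");
    cfg.SetProperty(NHibernate.Cfg.Environment.UseQueryCache, "true");

    // Assumed provider/assembly names from the NHibernate.Caches memcached package;
    // adjust to the distributed provider you roll out.
    cfg.SetProperty(NHibernate.Cfg.Environment.CacheProvider,
        "NHibernate.Caches.MemCache.MemCacheProvider, NHibernate.Caches.MemCache");

    var sessionFactory = cfg.BuildSessionFactory();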

Peter

8

Both. You should have a distributed cache to prevent unnecessary calls to the database and a tuned database so that the initial calls return quickly. As an example, Facebook required a significant amount of caching to scale, but I'm sure it wouldn't do much good if the initial queries took 10 minutes. :)

Joshua Dale

5

Two words: measure it.

Since you already have the cache implemented, you can probably measure what the impact of turning it off would be for benchmarking purposes.
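
One low-effort way to put numbers on it, assuming you can enable statistics in a test environment: NHibernate's built-in statistics expose second-level cache hit/miss counts and query execution counts, so you can run the same workload with the cache on and off and compare. A minimal sketch:

    // Enable statistics collection (adds some overhead, so do this in a test environment).
    var cfg = new NHibernate.Cfg.Configuration().Configure();
    cfg.SetProperty(NHibernate.Cfg.Environment.GenerateStatistics, "true");
    var sessionFactory = cfg.BuildSessionFactory();

    // ... run a representative workload against the session factory ...

    var stats = sessionFactory.Statistics;
    Console.WriteLine("2nd-level cache hits:   {0}", stats.SecondLevelCacheHitCount);
    Console.WriteLine("2nd-level cache misses: {0}", stats.SecondLevelCacheMissCount);
    Console.WriteLine("Queries executed:       {0}", stats.QueryExecutionCount);
    Console.WriteLine("Slowest query:          {0}", stats.QueryExecutionMaxTime);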

Hector Correa

5

I would think that a multi-web-server setup and a distributed second-level cache can, and probably should, coexist.

First of all, if we take memcached as an example, it supports distributed object storage, so if you're not using it that way, you could switch to that. It works.

Secondly, I'm guessing that you're introducing the web-server farm to respond to increasing web requests, which will in turn mean increasing requests for data. If you kill your caching, it won't matter how much you optimize your database: you're going to thrash it with queries. You may improve each query's execution time, but you'll still be spending that time waiting for the database to return your data.

This is especially true in the case where web node 1 requests dataset A and web node 2 also requests dataset A: you are going to run the same query twice, while with a second-level cache you only run it once.

So my recommendation is:

Don't kill your second level cache. You have already spent resources to implement it and by disabling it you are NOT going to improve your application's performance. Even a single node of memcached is going to be faster than having none at all.

Do optimize your database operations. This means both the database side (indexes, views, stored procedures, functions, perhaps a cluster with read-only and write-only nodes) and the application side (optimize your queries, profile lazy/eager loading, don't fetch data you don't need, and combine multiple queries into single round trips via Future, MultiQuery, MultiCriteria).
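
To illustrate the round-trip point, here is a small sketch using the NHibernate.Linq Query/ToFuture extensions; the Customer and Order entities and the sessionFactory variable are hypothetical stand-ins for your own model:

    using (var session = sessionFactory.OpenSession())
    {
        // Neither query hits the database yet; both are queued as "futures".
        var activeCustomers = session.Query<Customer>()      // hypothetical entity
                                     .Where(c => c.IsActive)
                                     .ToFuture();

        var openOrderCount = session.Query<Order>()          // hypothetical entity
                                    .ToFutureValue(q => q.Count(o => o.IsOpen));

        // Materializing the first future sends both queries in a single round trip
        // (when the ADO.NET driver supports query batching).
        var customers = activeCustomers.ToList();
        int openOrders = openOrderCount.Value;
    }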

Do optimize your second-level cache implementation. There are datasets with an effectively infinite expiration, which you therefore query from the database only once, and there are datasets with short expirations, whose (probably expensive) queries run more frequently. Optimizing your queries and your database will improve the performance of the queries themselves, but the second-level cache is going to save your skin under peak load, when the short-expiration datasets are served from the cache more often.
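
For example, you can keep long-lived and short-lived data in separate cache regions and give each region its own expiration in the provider's configuration. A minimal sketch; the entity names, region names, and expiration values are hypothetical, and how you set per-region expirations depends on the cache provider you use:

    // Reference data that almost never changes: its own region with a very long expiration.
    var countries = session.QueryOver<Country>()           // hypothetical entity
                           .Cacheable()
                           .CacheRegion("reference-data")
                           .List();

    // Volatile data: a short-lived region (e.g. 60-120 seconds in the provider config)
    // that still absorbs repeated identical queries under peak load.
    var openOrders = session.QueryOver<Order>()            // hypothetical entity
                            .Cacheable()
                            .CacheRegion("orders-short")
                            .List();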

If running textual queries is an everyday operation, use the database's full-text capabilities or, even better, an independent service like Lucene.NET (which can be integrated with NHibernate via NHibernate.Search).

Jaguar

2

That's a very difficult topic. In either case you need proficiency: either a very proficient DBA or a very proficient NHibernate/cache administrator.

Personally, I prefer having full control over my SQL and tuning the database. Since you only have multiple web servers (and not necessarily multiple database instances), you might be better off that way, too. Modern databases have very efficient caches, so you usually do more harm with badly configured second-level caches in the application than you would by just letting the database cache SQL statements, cursors, data, buffers, etc. I have seen this work very well with around 15 WebLogic servers and only one database with lots of memory.

Since you already have NHibernate, though, moving away from it and back to SQL (maybe with LINQ?) might be quite a costly task that's not worth the effort.

Lukas Eder

1

We use NHibernate's 2nd-level cache in our multi-server environment via the Microsoft AppFabric distributed cache framework (the NHibernate Velocity provider), with great success.

Having said that, using the 2nd-level cache requires a deeper understanding of the framework to prevent unexpected results. In addition, before adopting a distributed cache, it is important to measure its overhead.

So my answer is basically: before using the 2nd-level cache, you should really test and see whether it is actually needed.

sternr

  • Can you highlight some of the gotchas or issues to worry about? Also, what are the best metrics you used to measure? – leora Jul 19 '11 at 22:20