1

Suppose the situation that for example we have an instance of some SQL Server (it is not the case what it is). And we have a Java applications that is using the Spring stack.

There are queries that are already optimized but they are still slow due to complex logic of aggregating that data.

I have several approaches in mind (those are short-terms for now):

  1. Proceed with tuning (like creating views) and implement jobs to recalculate these data right in the SQL server for example every 5 minutes and store it in separate table. (Yes it is not so good solution but still).
  2. Implement some kind of mechanism to count / aggregate that data in background. Probably implement one part of Lambda-architecture. I've already looked at Apache Spark and others.

Under optimized it means that those queries are using the correct indexes and everything is 'tuned'.

I know that this is not kind of question as more proposals / discussions. But still I'm questioned.

What is the better way to handle situation like this based on the above?


UPDATE #1

Based on What you can and can't do with Indexed views for MS SQL Server the Indexed view are not the way to go as they do not support COUNT, MIN, MAX, TOP, outer joins, or a few other keywords or elements. You can’t modify the underlying tables and columns. The view is created with the WITH SCHEMABINDING option.

UPDATE #2

After spending some time on this. I've stopped with Materialized Views for now in sake of simplicity.

Yuri
  • 1,748
  • 2
  • 23
  • 26
  • 2
    Do the queries need to return 'live' data, or is some degree of 'stale' data acceptable? If it is the your suggestion 1/ is an ideal solution, and is what lots of large scale, eventually consistent, systems do. They create a read model and update it periodically or separately from the creation of the data. This is the heart of the CQS pattern. – Sam Holder Sep 30 '15 at 16:35
  • Well, of course it is better to return the live data but it is acceptable to return stale data, even 5 minutes will be enough. But I'm aware of the fact that there could be a lot background jobs on server. – Yuri Sep 30 '15 at 16:54
  • you could keep a replica of the database and run these backgrounds jobs on the replica... – Sam Holder Sep 30 '15 at 18:09
  • Okay, I'll give it a try. Will do some basic things with option 1. Thanks for advises and especially for CQS. – Yuri Sep 30 '15 at 18:39

1 Answers1

2

So, different database engines have the concept of a Materialized View. SQL server has the equivalent with it's Indexed Views. These are designed for your exact use case. I would strongly consider these methods before basically "rolling your own" materialized view.

Community
  • 1
  • 1
Rob Conklin
  • 8,806
  • 1
  • 19
  • 23
  • I'm aware of pre-calculate process that depends on multiple other tables. Under calculation I mean counters that I need to fetch at once. – Yuri Oct 01 '15 at 13:27
  • Based on [What you can and can't do with Indexed views](http://www.brentozar.com/archive/2013/11/what-you-can-and-cant-do-with-indexed-views/) YOU CAN’T… The view definition can’t reference other views, or tables in other databases. It can’t contain `COUNT`, `MIN`, `MAX`, `TOP`, outer joins, or a few other keywords or elements. You can’t modify the underlying tables and columns. The view is created with the `WITH SCHEMABINDING` option. – Yuri Oct 01 '15 at 13:32
  • Often when you have a complex query that can't be indexed, like queries with self joins, you can still observe massive gains when you create simpler views that can be indexed and then modify your query to use those views. Note: When I did this in SQL Azure I had to add the WITH (NOEXPAND) table hint. – Chris Schaller Oct 09 '15 at 14:49
  • Thanks for your suggestion. I've stopped on that. But I'm still considering about denormalization in future. – Yuri Jun 13 '16 at 15:18