
Generally we aggregate transaction data into one table using a SQL job that runs periodically. Now the customer needs live data, so an interval of an hour is too coarse. Instead of running the job every few minutes, the team suggests creating a view that contains the logic for selecting the data from the transaction table itself.

But my confusion is whether there is any performance difference between a plain SELECT query and a view. Putting data into an aggregate table periodically and selecting from that aggregation is easy, because it does not affect the main tables.

A direct SELECT statement against the transaction tables impacts INSERT operations, which happen frequently. If I create a view, I think the concept is the same as a SELECT statement: we are still selecting from the main tables. Or is my understanding wrong?

Please suggest the best approach for this, and why.

Akhil

2 Answers


If i create a view i think the concept is same as select statement

Correct, a view in its simplest form is nothing more than a saved query definition. When the view is used in a query, the definition is expanded into the outer query and optimised accordingly, so there is no performance benefit. This is not true, of course, for indexed views, where the view essentially becomes its own table; the NOEXPAND query hint stops the definition from being expanded and simply reads from the view's index(es). Since this is an aggregation query, though, I suspect an indexed view won't even be possible, never mind a viable solution.
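To illustrate the difference, here is a sketch using a hypothetical `dbo.Transactions` table (the table and column names are assumptions, not from the question). Note that simple SUM/COUNT_BIG group-by views *can* sometimes be indexed, but aggregates like AVG, MIN, MAX or DISTINCT cannot, which is why an indexed view may not be viable here:

```sql
-- A plain view: just a saved definition, expanded inline by the optimiser.
CREATE VIEW dbo.vSalesTotals
WITH SCHEMABINDING              -- required if we later want to index the view
AS
SELECT ProductID,
       COUNT_BIG(*) AS TransactionCount,  -- COUNT_BIG(*) is required to index a GROUP BY view
       SUM(Amount)  AS TotalAmount
FROM   dbo.Transactions
GROUP  BY ProductID;
GO

-- Adding a unique clustered index materialises it as an indexed view.
CREATE UNIQUE CLUSTERED INDEX IX_vSalesTotals
    ON dbo.vSalesTotals (ProductID);
GO

-- NOEXPAND forces the read from the view's index instead of expanding the definition.
SELECT ProductID, TotalAmount
FROM   dbo.vSalesTotals WITH (NOEXPAND);
```

Without the index and the NOEXPAND hint, the first and last statements behave exactly like running the underlying SELECT directly against the base table.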

The next part, about having a table to store the aggregation, is more difficult to answer. Yes, this could benefit performance, but at the cost of not having up-to-the-minute data, and of having to maintain the table. Whether this is a suitable solution is entirely dependent on your needs: how up to date the data needs to be, how often it is required, and how long it takes to (a) populate the reporting table and (b) run the query on its own.

For example if it takes 20 seconds to run the query, but it is only needed twice a day, then there is no point running this query every hour to maintain a reporting table to assist a query that is run twice a day.
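A minimal sketch of such a scheduled refresh, run from a SQL Agent job (the `dbo.SalesSummary` table and column names are hypothetical):

```sql
-- Rebuild the reporting table wholesale inside one transaction,
-- so readers never see a half-populated summary.
BEGIN TRANSACTION;

TRUNCATE TABLE dbo.SalesSummary;   -- TRUNCATE is transactional in SQL Server

INSERT INTO dbo.SalesSummary (ProductID, TransactionCount, TotalAmount)
SELECT ProductID, COUNT(*), SUM(Amount)
FROM   dbo.Transactions
GROUP  BY ProductID;

COMMIT TRANSACTION;
```

How often this job runs is exactly the trade-off described above: the refresh cost is paid on the schedule, regardless of how often the report is actually read.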

Another option could be maintaining this reporting table through triggers, i.e. when a row is inserted or updated, cascade the change to the reporting table there and then. This would keep the reporting table up to date, but again, if you are inserting millions of transactions a day and running the report only a few times, you have to weigh up whether the additional overhead caused by the trigger is worth it.
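A sketch of the trigger approach, again against the hypothetical tables above; the MERGE upserts one summary row per product touched by the insert:

```sql
-- AFTER INSERT trigger cascading new rows into the summary table.
-- Runs inside the inserting transaction, so it adds latency to every INSERT.
CREATE TRIGGER trg_Transactions_Insert
ON dbo.Transactions
AFTER INSERT
AS
BEGIN
    SET NOCOUNT ON;

    MERGE dbo.SalesSummary AS tgt
    USING (SELECT ProductID, COUNT(*) AS Cnt, SUM(Amount) AS Amt
           FROM inserted                 -- the pseudo-table of just-inserted rows
           GROUP BY ProductID) AS src
       ON tgt.ProductID = src.ProductID
    WHEN MATCHED THEN
        UPDATE SET tgt.TransactionCount += src.Cnt,
                   tgt.TotalAmount      += src.Amt
    WHEN NOT MATCHED THEN
        INSERT (ProductID, TransactionCount, TotalAmount)
        VALUES (src.ProductID, src.Cnt, src.Amt);
END;
```

Note the trigger cost is paid on every insert batch, which is the overhead being weighed up here.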

You could reduce the impact on write operations by using a transaction isolation level of READ UNCOMMITTED for the SELECT, but as with the summary-table option, this comes at the cost of accuracy, as you will be reading uncommitted transactions.
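For example, either of the following lets the aggregation read without taking shared locks that would block writers (at the risk of dirty reads):

```sql
-- Session-level: applies to all subsequent reads in this session.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

SELECT ProductID, SUM(Amount) AS TotalAmount
FROM   dbo.Transactions
GROUP  BY ProductID;

-- Or the equivalent per-table hint:
SELECT ProductID, SUM(Amount) AS TotalAmount
FROM   dbo.Transactions WITH (NOLOCK)
GROUP  BY ProductID;
```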

A final, intermediate option: create a summary table daily, and partition/index your main table by date. You can then get the summary of historic data from the daily table and union it with only today's data, which should be relatively fast with the right indexes.
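A sketch of that hybrid query (the `dbo.SalesSummaryDaily` table, its `SummaryDate` column, and the `TransactionDate` column are hypothetical names for illustration):

```sql
-- Pre-aggregated history plus live rows for today only.
SELECT ProductID, SUM(TotalAmount) AS TotalAmount
FROM (
    SELECT ProductID, TotalAmount
    FROM   dbo.SalesSummaryDaily              -- populated by the nightly job
    WHERE  SummaryDate < CAST(GETDATE() AS date)

    UNION ALL

    SELECT ProductID, SUM(Amount) AS TotalAmount
    FROM   dbo.Transactions                   -- only today's rows are scanned
    WHERE  TransactionDate >= CAST(GETDATE() AS date)
    GROUP  BY ProductID
) AS combined
GROUP BY ProductID;
```

With the main table partitioned or indexed on the date column, the live part of the query touches only a single day's worth of rows.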

GarethD
  • OK, I got it. In my case I am aggregating data shift-wise, so I use a job that runs every shift (8 hours). For live data I used a view. – Akhil Feb 27 '14 at 06:23

A good approach is to create the view and profile your DB: the best way to check whether the view is a good solution is to create it and test the results.

A similar question was asked here: Is a view faster than a simple query?

jean