
I would like to speed up our SQL queries. I have started reading a book on data warehousing, where you have a separate database with the data in different tables, etc. The problem is that I do not want to create a separate reporting database for each of our clients, for a few reasons:

  1. We have over 200 clients; maintaining that many databases is already enough work
  2. Reporting data must be available immediately

I was wondering if I could simply denormalize the tables that I report on, as currently there are a lot of JOINs and I believe these are expensive (about 20,000,000 rows in the tables). If I copied the data into multiple tables, would this increase the performance by a fair bit? I know there are issues with data being copied all over the place, but this could also be good from a history point of view.
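
For illustration, something like this is what I have in mind: flattening the joined data into one wide reporting table (table and column names here are just made-up examples):

    -- Example only: copy the joined data into one wide reporting table.
    -- (CREATE TABLE AS works on PostgreSQL/Oracle/MySQL; SQL Server would use SELECT ... INTO.)
    CREATE TABLE report_orders_flat AS
    SELECT o.order_id,
           o.order_date,
           c.client_name,
           p.product_name,
           ol.quantity,
           ol.quantity * p.unit_price AS line_total
    FROM   orders o
    JOIN   order_lines ol ON ol.order_id = o.order_id
    JOIN   products p     ON p.product_id = ol.product_id
    JOIN   clients  c     ON c.client_id  = o.client_id;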

Cœur
Gillardo

3 Answers


Denormalization is no guarantee of an improvement in performance.

Have you considered tuning your application's queries? Take a look at what reports are running, identify places where you can add indexes and partitioning. Perhaps most reports only look at the last month of data - you could partition the data by month, so only a small amount of the table needs to be read when queried. JOINs are not necessarily expensive if the alternative is a large denormalized table that requires a huge full table scan instead of a few index scans...
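
For example, a rough sketch using PostgreSQL-style declarative partitioning (table and column names are invented; SQL Server and Oracle use different partitioning syntax):

    -- Partition the reporting table by month, so a "last month" report only
    -- reads one partition instead of the whole 20-million-row table.
    CREATE TABLE report_facts (
        fact_id    bigint,
        client_id  integer,
        fact_date  date,
        amount     numeric
    ) PARTITION BY RANGE (fact_date);

    CREATE TABLE report_facts_2014_04 PARTITION OF report_facts
        FOR VALUES FROM ('2014-04-01') TO ('2014-05-01');

    -- An index matching the report's filter columns lets the optimizer do a
    -- few index scans instead of a full table scan.
    CREATE INDEX idx_report_facts_client_date
        ON report_facts (client_id, fact_date);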

Your question is much too general - talk with your DBA about doing some traces on the report queries (and look at the plans) to see what you can do to help improve report performance.

N West
  • Even something as simple as ensuring you have recent, updated statistics on the tables can help. – N West Apr 22 '14 at 13:14
  • Sorry if the question is too general, but our system allows users to generate their own reports, so reports in one database may be completely different to reports in another database. I'm afraid we don't have a DBA in the company, so I am trying to improve the performance myself. I have got indexes on the tables that are used the most, but I'm not a DBA – Gillardo Apr 22 '14 at 13:24

The question is very general. It is hard to answer whether denormalization will increase performance.

Basically, it CAN. But personally, I wouldn't consider denormalizing as a solution to reporting issues. In my experience, business people love to build huge reports that kill the OLTP DB at the least appropriate time. I would continue reading about data warehousing :)

d_z

Yes, for an OLAP application your performance will improve with denormalization, but if you use the same denormalized tables for your OLTP application you will see a performance bottleneck there. I suggest you create new denormalized tables or a materialized view for your reporting purposes, and you can incrementally fast-refresh the MV so that reporting data is available immediately.
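
For example, an Oracle-style sketch (table and column names are made up; other databases have different materialized view syntax):

    -- Materialized view logs plus REFRESH FAST ON COMMIT keep a denormalized
    -- copy of the joined data current, so reports can read it directly.
    CREATE MATERIALIZED VIEW LOG ON orders      WITH ROWID;
    CREATE MATERIALIZED VIEW LOG ON order_lines WITH ROWID;

    CREATE MATERIALIZED VIEW mv_order_report
      REFRESH FAST ON COMMIT
    AS
    SELECT o.rowid  AS o_rid,   -- rowids of every base table are required
           ol.rowid AS ol_rid,  -- for a fast-refreshable join MV
           o.order_id,
           o.order_date,
           ol.product_id,
           ol.quantity
    FROM   orders o, order_lines ol
    WHERE  ol.order_id = o.order_id;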

er_suthar
  • Denormalization, by itself, is no guarantee of a performance gain. – N West Apr 22 '14 at 13:09
  • @NWest in a data warehouse it guarantees a performance improvement, as joining 10 tables to generate a report is always more costly than fetching the same data from a denormalized table. – er_suthar Apr 22 '14 at 13:14
  • @NWest, Of course it is no guarantee. Nothing is guaranteed without analysis of the concrete schema. However, "In computing, denormalization is the process of attempting to optimize the read performance of a database by adding redundant data or by grouping data." [link](http://en.wikipedia.org/wiki/Denormalization) – d_z Apr 22 '14 at 13:15
  • It all depends on their queries. A full table scan of 200 million rows of unnecessarily wide pages to find the 3 I want, when I can use indexes and joins to quickly throw out the 95% of the data that I don't need to scan? – N West Apr 22 '14 at 13:16
  • @NWest, Of course, it depends. However the question was about denormalization specifically. And yes, denormalization CAN increase performance of read queries if done correctly. – d_z Apr 22 '14 at 13:18
  • As well stated in an answer to this question (http://stackoverflow.com/questions/2349270/in-what-way-does-denormalization-improve-database-performance) : "The word "denormalizing" leads to confusion of the design issues. Trying to get a high performance database by denormalizing is like trying to get to your destination by driving away from New York. It doesn't tell you which way to go." – N West Apr 22 '14 at 13:21
  • @NWest yes, that is the case for a normal OLTP application. In a typical OLAP (data warehouse) application you don't need 3 rows out of 200 million; rather, you have to perform GROUP BY and complex operations on the whole set of those 200 million records. – er_suthar Apr 22 '14 at 13:24
  • @NWest, I like the two other answers (22 and 32) better :) – d_z Apr 22 '14 at 13:31
  • I'm not saying that it can't improve performance. I've built plenty of excellently performing star schemas. It's just dangerous to tell people "denormalize for performance" without giving direction as to *how*. Materialized views of queries (snapshots) as suggested in this answer are a pretty rough way to go and can lead to maintenance nightmares down the road. – N West Apr 22 '14 at 13:35
  • Also note the 249 score here ;-) http://stackoverflow.com/questions/173726/when-and-why-are-database-joins-expensive?rq=1 – N West Apr 22 '14 at 13:38