
I have the following query which takes about 10 minutes. I need this to be much quicker. Any ideas what can be done to tune this query? The r_pos_transaction_head table has a little under 500,000 records, and the r_pos_transaction_detl has a little under 900,000 records.

I have created indexes where I thought appropriate (you can see these in use in the plan).

truncate table t_retail_history
insert into t_retail_history
select
  h.source_db as legacy_source_db,
  h.company as legacy_company,
  h.store_code as legacy_store_code,
  h.register as legacy_register,
  cast(h.register as char(1)) + '/' + cast(h.transaction_no as varchar(10)) as legacy_transaction_no,
  t_warehouse.store_number as store_number,
  h.transaction_no as reference,
  convert(varchar(10),dbo.datefromdays(h.date),103) as transaction_date,
  convert(varchar(5),dateadd(second,h.time,cast(cast(getdate() as date) as datetime)), 108) as transaction_time,
  d.product_code as legacy_product_code,
  coalesce(d.colour_no,0) as legacy_colour_no,
  coalesce(g_colour_name_replacement.new_colour_name,s.colour_name,'') as legacy_colour_name,
  coalesce(d.size_no,0) as legacy_size_no,
  coalesce(s.size_code,'') as legacy_size_code,
  d.price_inc_tax as legacy_price_inc_tax,
  d.sku_no as legacy_sku_no,
  null as barcode,
  d.quantity as qty,
  d.nett_total as sales_total,
  null as person_code,
  t_warehouse.destination_busdiv_prefix
from
  svi.r_pos_transaction_head h
inner join
  svi.r_pos_transaction_detl d on
  d.company = h.company
  and d.store_code = h.store_code
  and d.register = h.register
  and d.tx_code = h.transaction_no
inner join
  svi.g_skus s on
  s.company = h.company
  and s.product_code = d.product_code
  and (
    s.colour_position = d.colour_no
    or s.colour_position is null and d.colour_no = 0
  )
  and (
    s.size_position = d.size_no
    or s.size_position is null and d.size_no = 0
  )
left outer join
  g_colour_name_replacement on
  g_colour_name_replacement.product_code = d.product_code
  and g_colour_name_replacement.old_colour_name = s.colour_name
left outer join
  t_warehouse on
  t_warehouse.legacy_svi_code = right('000' + cast(h.store_code as nvarchar(5)),3)
where
  d.quantity <> 0
  and d.nett_total <> 0

Explain Plan

Any help appreciated!

Lock
    Have you tried analysing it in the _Database Engine Tuning Advisor_? – enapi May 28 '14 at 06:36
  • Can you tell us if the `SELECT` statement alone also takes 10 minutes? Because if you are inserting around 100 million rows it will be hard to optimize it :p – Ryx5 May 28 '14 at 06:37
  • @user1994514- No I haven't. I have never used it before, I might take a look. And yes, the `SELECT` takes the same time, so I don't think the insert itself is the issue here. – Lock May 28 '14 at 06:44
  • How about replacing all 'whole table' joins with just the columns you need from the other tables? Or maybe even creating temporary tables with that info and joining them to the main query might help a bit? – glaeran May 28 '14 at 06:46
  • you forgot to include r_pos_transaction_head table in that query. – Wumar May 28 '14 at 06:47
  • OK I'm running the Tuning Advisor now. It's come back with some partition recommendations and some index recommendations which it thinks can improve things by 26%. I was hoping for a better improvement than that. – Lock May 28 '14 at 06:52
  • @glaeran- I did think about this, however pretty much all the tables used in the joins contain only the data that is needed, so I couldn't "compact" them into smaller temp tables if I wanted to. – Lock May 28 '14 at 06:52
  • Why do you need this statement faster? To truncate a table t_retail_history and fill it with initial data should be a one-time task, no? So who cares if it takes 10 minutes or 100? However, if you want it faster, it would be necessary to know how many of the records you get with `d.quantity <> 0 and d.nett_total <> 0`. If it's only a small part, say 2% for instance, then you could re-write this with decodes and build function indexes for it. If it's a big part, say 70%, then consider parallel hints. – Thorsten Kettner May 28 '14 at 07:13
  • This join condition looks sub optimal: `t_warehouse.legacy_svi_code = right('000' + cast(h.store_code as nvarchar(5)),3)`. If possible, fix the data in the header table so you can just join on `t_warehouse.legacy_svi_code = h.store_code`... – user1429080 May 28 '14 at 07:14
  • @user1429080- thanks. Yep, I agree, however I don't really want to update that table. I have to do the '000' + due to a database inconsistency in the source system (all the other tables refer to the field as a varchar, but this appears to be the only table where it is an int). This isn't having much of a negative effect. – Lock May 28 '14 at 07:24
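One way to act on user1429080's point without changing the stored data would be a persisted computed column (editor's sketch; the column and index names here are made up, not from the question):

```sql
-- A persisted computed column normalises store_code once, can be indexed,
-- and leaves the original int column untouched.
alter table svi.r_pos_transaction_head
  add store_code_char as right('000' + cast(store_code as nvarchar(5)), 3) persisted;

create index ix_head_store_code_char
on svi.r_pos_transaction_head (store_code_char);

-- The t_warehouse join could then become a plain equality:
--   t_warehouse.legacy_svi_code = h.store_code_char
```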

3 Answers


The query is correctly written; as everyone suggests, try adding some indexes on the joined fields.

For me, the bad part of the query is:

and (
    s.colour_position = d.colour_no
    or s.colour_position is null and d.colour_no = 0
)
and (
    s.size_position = d.size_no
    or s.size_position is null and d.size_no = 0
)

Because an OR condition in an INNER JOIN is a performance killer. There are many tricks to avoid it (such as doing two LEFT JOINs, one per condition, and then using the WHERE clause to remove the rows where the LEFT JOIN produced NULL).

I did some more research and found a post on Stack Overflow that proposes something similar. You should try the UNION option so that you don't have to rebuild your whole list of SELECT fields.
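A sketch of that UNION option against the tables from the question (editor's note: only the detail-to-SKU join is shown, and each NULL/0 combination would need its own branch):

```sql
-- Each branch has equality-only join predicates, so the optimizer can use
-- index seeks; UNION ALL avoids the duplicate-removal sort of plain UNION.
select d.*, s.colour_name, s.size_code
from svi.r_pos_transaction_detl d
join svi.g_skus s
  on s.company = d.company
 and s.product_code = d.product_code
 and s.colour_position = d.colour_no
 and s.size_position = d.size_no
union all
select d.*, s.colour_name, s.size_code
from svi.r_pos_transaction_detl d
join svi.g_skus s
  on s.company = d.company
 and s.product_code = d.product_code
 and s.colour_position is null
 and s.size_position = d.size_no
where d.colour_no = 0
-- ...plus two more branches for the remaining NULL/0 combinations.
```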

I don't have time to rewrite the whole query for you; keep me informed.

Ryx5
  • Thanks for that. I had a feeling that `OR` would be the performance killer. A DBA at my work said that the database can sometimes run the query twice, effectively doing a `union` between the result sets. I think I may be able to fix this, however, by running an update on those source tables to set null values to 0, and then not using an `OR` in the join. I'll have a play around. I appreciate the time taken to help though! – Lock May 28 '14 at 07:03
  • Haha I love Stackoverflow! I ran an update on the tables mentioned above, setting null values in size_position and colour_position to 0.. the query took 28 seconds!! Thank you so much. – Lock May 28 '14 at 07:10
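The fix Lock describes in the comment above boils down to this (editor's sketch, using the table and column names from the question):

```sql
-- Normalise NULL positions to 0 so the join predicate becomes a plain
-- equality and the OR disappears from the join.
update svi.g_skus set colour_position = 0 where colour_position is null;
update svi.g_skus set size_position = 0 where size_position is null;

-- The g_skus join in the original query then simplifies to:
--   and s.colour_position = d.colour_no
--   and s.size_position = d.size_no
```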

You can use indexed views to get better-performing joins. Improve your indexing so that the plan uses index seeks instead of index scans. Also, the cost percentages in your plan image don't add up to 100%; where is the rest?
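For example, an indexed view could pre-join the head and detail tables (editor's sketch; it assumes the chosen key is unique across detail rows and that the base tables allow SCHEMABINDING):

```sql
-- Indexed views require SCHEMABINDING and two-part table names; the
-- unique clustered index is what materialises the view on disk.
create view svi.v_pos_tx with schemabinding as
select h.company, h.store_code, h.register, h.transaction_no,
       d.product_code, d.colour_no, d.size_no, d.quantity, d.nett_total
from svi.r_pos_transaction_head h
join svi.r_pos_transaction_detl d
  on d.company = h.company
 and d.store_code = h.store_code
 and d.register = h.register
 and d.tx_code = h.transaction_no;
go
create unique clustered index ix_v_pos_tx
on svi.v_pos_tx (company, store_code, register, transaction_no,
                 product_code, colour_no, size_no);
```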

Amir Keshavarz

It seems you don't have the required indexes to support your query. You should see if you can create indexes on the columns you join on.
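For instance, covering indexes on the join columns might look like this (editor's sketch; the index names and INCLUDE column lists are illustrative, not from the question):

```sql
-- Detail side: seek on the head-to-detail join key, with the selected
-- columns included so the query never has to touch the base table.
create index ix_detl_join
on svi.r_pos_transaction_detl (company, store_code, register, tx_code)
include (product_code, colour_no, size_no, price_inc_tax,
         sku_no, quantity, nett_total);

-- SKU side: seek on the detail-to-SKU join key.
create index ix_skus_join
on svi.g_skus (company, product_code, colour_position, size_position)
include (colour_name, size_code);
```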

mordack550
  • Why do you say that? If you see my plan, it is using indexes almost on all joins. – Lock May 28 '14 at 06:54
  • Using an index properly means an index seek, not an index scan. – Amir Keshavarz May 28 '14 at 06:59
  • @Lock First of all, an index scan _may_ mean that SQL Server could not find a better index to support your joins. Also, hash joins are proof that your query is not well supported by indexes. – mordack550 May 28 '14 at 07:03