
I have a table with customer receipts. I'm trying to generate a report showing the customer's name, address, and purchase totals broken down by department. The desired output should look like:


|Customer       |Address                | Clothing | Electronics | Hardware | Household |
|---------------|-----------------------|----------|-------------|----------|-----------|
|Homer Simpson  | 724 Evergreen Terr    | $42      | $20         | $500     | $24       |
|Walter White   | 308 Negra Arroyo Lane | $120     | $80         | $52      | $2400     |

The receipts table is part of a temporal model. So, the code looks like:

```sql
SELECT c.customername,
       a.address,
       r.receiptno,
       ri.department,
       ri.amount
FROM customer c
INNER JOIN customer_address_lnk cal ON cal.customerid = c.id
INNER JOIN address a ON cal.addressid = a.id
INNER JOIN customer_receipts_lnk crl ON crl.customerid = c.id
INNER JOIN receipts r ON crl.receiptid = r.id
INNER JOIN receipts_receiptitem_lnk rrl ON rrl.receiptid = r.id
INNER JOIN receiptitem ri ON ri.id = rrl.receiptitemid
```

The lnk tables are linking tables.

The receiptitem table has the following columns: ID, Department, Amount, CreatedDate, UpdatedDate

The idea is that if a receipt is updated, the amount can be adjusted for returns, price adjustments, and so forth.

The goal is to get the query under 5 seconds. We have over 125 million rows in the receiptitem table alone, and it takes SQL Server 20+ minutes to calculate the report.

I've tried CTEs on views without success. I've tried different join orders and LEFT JOINs. Even PIVOT didn't speed it up. I still can't get it under 20 minutes.
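For reference, the pivoted shape I'm after can be written with conditional aggregation, roughly like this (departments hard-coded for brevity):

```sql
-- Rough shape of the report query: one row per customer, one column per department
SELECT c.customername,
       a.address,
       SUM(CASE WHEN ri.department = 'Clothing'    THEN ri.amount ELSE 0 END) AS Clothing,
       SUM(CASE WHEN ri.department = 'Electronics' THEN ri.amount ELSE 0 END) AS Electronics,
       SUM(CASE WHEN ri.department = 'Hardware'    THEN ri.amount ELSE 0 END) AS Hardware,
       SUM(CASE WHEN ri.department = 'Household'   THEN ri.amount ELSE 0 END) AS Household
FROM customer c
INNER JOIN customer_address_lnk cal ON cal.customerid = c.id
INNER JOIN address a ON cal.addressid = a.id
INNER JOIN customer_receipts_lnk crl ON crl.customerid = c.id
INNER JOIN receipts r ON crl.receiptid = r.id
INNER JOIN receipts_receiptitem_lnk rrl ON rrl.receiptid = r.id
INNER JOIN receiptitem ri ON ri.id = rrl.receiptitemid
WHERE c.customername = 'Walter White'
GROUP BY c.customername, a.address
```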

Before I start down the path of creating a Function to get it under the 5 second goal, I'm open to any suggestions. I have limited ability to alter indices at this time.

Any thoughts?

  • No WHERE filters? You say you have a 125 million row table. How big is the output of the query? Even if the server processes all those rows in 5 seconds, I'm pretty sure you can't pipe it down the network and process it in your application within the allotted time. – kirchner Nov 09 '17 at 18:34
  • The query I showed is just to give an idea of how the database links. I do use where filters to limit to 1-2 customers. So where c.name like 'Walter White'. We also have a customer number that I can use to pull the customer by. – Xenoranger Nov 09 '17 at 19:30
  • Wait, you run this for *one* customer and it takes 20+ minutes? How big is the result set for one customer? Can you post the query execution plan for one customer? – Bacon Bits Nov 09 '17 at 20:23
  • I can't post too much information as the actual setup is proprietary. What I can say is that the basic model I've provided covers the system. – Xenoranger Nov 10 '17 at 23:28

1 Answer


Well, obviously views and SQL functions are different things.

Use a function when it should be clear to a future user (maybe yourself!) that the returned data requires certain parameters, because the data does not make sense without them. It's sort of like forcing the user to include a WHERE clause.

In your example, you may want to force the user to filter by CustomerId or ReceiptId.
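For example, an inline table-valued function that forces the caller to supply a customer id might look roughly like this (untested sketch; table and column names are taken from your post, the function name is made up):

```sql
CREATE FUNCTION dbo.fn_CustomerDepartmentTotals (@CustomerId INT)
RETURNS TABLE
AS
RETURN
(
    -- Per-department totals for a single customer; the parameter is mandatory,
    -- so nobody can accidentally aggregate all 125M receipt items.
    SELECT c.customername,
           a.address,
           ri.department,
           SUM(ri.amount) AS total
    FROM customer c
    INNER JOIN customer_address_lnk cal ON cal.customerid = c.id
    INNER JOIN address a ON cal.addressid = a.id
    INNER JOIN customer_receipts_lnk crl ON crl.customerid = c.id
    INNER JOIN receipts r ON crl.receiptid = r.id
    INNER JOIN receipts_receiptitem_lnk rrl ON rrl.receiptid = r.id
    INNER JOIN receiptitem ri ON ri.id = rrl.receiptitemid
    WHERE c.id = @CustomerId
    GROUP BY c.customername, a.address, ri.department
);
```

You'd call it as `SELECT * FROM dbo.fn_CustomerDepartmentTotals(42);`.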

HOWEVER....

In this case, the view approach would probably be better.

  1. Multi-statement functions, by design, cannot use temporary tables; they use table variables instead, and table variables are much slower than temp tables. (Inline table-valued functions, like the sketch above, are the exception: they are expanded into the calling query much like a view.)
  2. The query you've included is really straightforward, with no surprises. The view would be the simplest and best approach here; a minimal sketch follows below.
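
That view could look something like this (again untested, reusing the names from your query; the view name is made up):

```sql
CREATE VIEW dbo.CustomerReceiptItems
AS
-- One row per receipt item, with the customer id exposed so callers can filter on it
SELECT c.id AS customerid,
       c.customername,
       a.address,
       r.receiptno,
       ri.department,
       ri.amount
FROM customer c
INNER JOIN customer_address_lnk cal ON cal.customerid = c.id
INNER JOIN address a ON cal.addressid = a.id
INNER JOIN customer_receipts_lnk crl ON crl.customerid = c.id
INNER JOIN receipts r ON crl.receiptid = r.id
INNER JOIN receipts_receiptitem_lnk rrl ON rrl.receiptid = r.id
INNER JOIN receiptitem ri ON ri.id = rrl.receiptitemid;
```

Callers then filter it, e.g. `SELECT * FROM dbo.CustomerReceiptItems WHERE customerid = 42;`, and the optimizer can push that predicate down into the underlying joins.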

For 125M rows, I suggest either checking the execution plan during processing (include a WHERE clause when you do) or dumping the data into a summary table that is updated periodically, as sketched below. Or both. Check indexes all along the way.
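One possible shape for that summary table (every name here is hypothetical; the refresh would run from a scheduled job such as SQL Agent):

```sql
-- Hypothetical pre-aggregated summary: one row per customer/department
CREATE TABLE dbo.CustomerDepartmentSummary
(
    customerid  INT           NOT NULL,
    department  VARCHAR(50)   NOT NULL,
    total       DECIMAL(18,2) NOT NULL,
    refreshedat DATETIME2     NOT NULL DEFAULT SYSUTCDATETIME(),
    CONSTRAINT PK_CustomerDepartmentSummary PRIMARY KEY (customerid, department)
);

-- Periodic refresh: a full rebuild is shown for simplicity; an incremental
-- MERGE keyed on receiptitem.UpdatedDate would also work.
TRUNCATE TABLE dbo.CustomerDepartmentSummary;

INSERT INTO dbo.CustomerDepartmentSummary (customerid, department, total)
SELECT crl.customerid,
       ri.department,
       SUM(ri.amount)
FROM customer_receipts_lnk crl
INNER JOIN receipts r ON crl.receiptid = r.id
INNER JOIN receipts_receiptitem_lnk rrl ON rrl.receiptid = r.id
INNER JOIN receiptitem ri ON ri.id = rrl.receiptitemid
GROUP BY crl.customerid, ri.department;
```

The report then reads a handful of pre-aggregated rows per customer instead of scanning 125M receipt items; the trade-off is that totals are only as fresh as the last refresh.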

Here's more (and better) discussion: Test SQL Queries
