SQL Server - GROUP BY with HAVING on more than 10 million rows hitting performance

Question

SQL Server - how to improve query performance with GROUP BY and with HAVING clause of 30 + OR conditions on More than 10 million rows

indexing is the best way for performance, also see the Execution Plan from SQL server Management studio to see where is taking much time. — Hasan Mahmood, Mar 20 '19 at 17:32
Take a breath and think about what got you into such a position :) — Jason Goemaat, Mar 20 '19 at 17:37

score 2 · Accepted Answer · answered Mar 20 '19 at 17:46

2

We can be more helpful if you show us your query (you can obfuscate it if you need), but generically you can create computed, persisted, bit columns that pre-calculate the OR statements for you:

https://blog.sqlauthority.com/2016/04/27/sql-server-computed-column-conditions-case-statement/

Instead of:

HAVING [A] > 100000 OR [B] < 1000

Use:

ALTER TABLE [FOO]
ADD IsFiltered AS CASE WHEN [A] > 100000 OR [B] < 1000 THEN 1 ELSE 0 END PERSISTED

And then add [IsFiltered] to an index for extra speed. You can also create functions to perform calculations for you:

formula for computed column based on different table's column

Alternatively, it may be time to bite the bullet and create these calculations in an SSAS cube. That can be a big leap, but cubes can provide a lot of insight into your data.

answered Mar 20 '19 at 17:46

Russell Fox

5,273
1
24
28

considering how bad the question was this is an excellent answer. Good Job! – Hogan Mar 20 '19 at 17:56
1

I don't get it. A computed column cannot calculate an aggregate -- at least efficiently. If the `having` is on the group by keys, then that filtering should be moved to a `where` clause. – Gordon Linoff Mar 20 '19 at 20:05
@GordonLinoff -- I think the point was that if there are calculation to be used in the "condidtions" they can be pre-calculated for each row to save time during the group by step. – Hogan Mar 21 '19 at 12:47
@Hogan . . . Computed columns are not available after the `GROUP BY`. Having computed columns only referenced there is not going to be much help. – Gordon Linoff Mar 21 '19 at 13:13

SQL Server - GROUP BY with HAVING on more than 10 million rows hitting performance

1 Answers1