0

I have a table in SQL Server having 500+ million rows with default clustered index defined on the primary key of table. I am running this simple query which takes more than 30 minutes.

Select count(ledgeridXXX) from Ledger.dbo.tblXXXX 

Here is the chunk of clustered index definition

CONSTRAINT [PK__tblDepar__AE70E0AFF9BAF7B9] PRIMARY KEY CLUSTERED 
(
    [LedgerIDXXX] ASC
) 
WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, 
      ALLOW_PAGE_LOCKS = ON
) ON [PRIMARY]

I don't have any clue where it could be wrong and why is it taking so much time. Please suggest me some ways or techniques on how to fix it as this is a critical table and main reporting heavily relies on this table.

Mahrukh Mehmood
  • 258
  • 2
  • 5
  • 17
  • 2
    Have you tried `count(*)`? Is that column nullable? – shawnt00 Sep 03 '20 at 06:27
  • No, its a primary key of the table. I don't want * as this would increase more time. – Mahrukh Mehmood Sep 03 '20 at 06:42
  • Show us the estimated execution plan using Paste The Plan. – Dale K Sep 03 '20 at 06:45
  • If it's not null then why would that increase anything? – shawnt00 Sep 03 '20 at 06:45
  • It's a myth that has never been true that `count(not_null_column)` is faster than `count(*)` - the performance for both is absolutely identical. So `count(*)` won't change anything, I am quite certain that SQL Server uses the exact same execution plan for both. I guess there is not much you can do, except for throwing faster hardware on the problem or maybe make SQL Server use a parallel threads for the counting. But in the end it has to go all rows in the table to do the counting –  Sep 03 '20 at 06:49
  • @DaleK please see the plan https://www.brentozar.com/pastetheplan/?id=rJhkGzCQP – Mahrukh Mehmood Sep 03 '20 at 06:51
  • @a_horse_with_no_name Basically I wanted to be on a safer side so I was just checking on single column which has index to check the performance. Thank you for the update. – Mahrukh Mehmood Sep 03 '20 at 06:53
  • @shawnt00 I meant to say that I just need a single coulmn count response time to check the sql table performance – Mahrukh Mehmood Sep 03 '20 at 06:53
  • I was thinking that SQL Server did allow a primary key on a nullable column. Maybe that's because I was just reading up on SQLite quirks. I would try these approaches although I believe some only given approximate answers. And honestly I wasn't sure if this had changed in the last few versions. https://stackoverflow.com/questions/54921610/does-a-select-count-query-have-to-do-a-full-table-scan – shawnt00 Sep 03 '20 at 06:54
  • https://learn.microsoft.com/en-us/archive/blogs/martijnh/sql-serverhow-to-quickly-retrieve-accurate-row-count-for-table – shawnt00 Sep 03 '20 at 06:57
  • @shawnt00 I will surely read these articles. My main concern is with the table performance right now and I am clueless – Mahrukh Mehmood Sep 03 '20 at 08:22

2 Answers2

1

if you are just counting the number of rows, you can try this:

SELECT SUM(p.rows) as [cnt]
FROM sys.partitions AS p
INNER JOIN sys.tables AS t ON p.[object_id] = t.[object_id]
INNER JOIN sys.schemas AS s ON t.[schema_id] = s.[schema_id]
WHERE p.index_id = 1 /* clustered index */
AND t.name = N'tblXXXX'AND s.name = N'dbo';

the COUNT(clusteredkey) will probably have some overhead when you have intensive workload. this can take hours scanning the entire 500million rows.

if you are running higher version of SQL Server, you can use the columnstore index to speed up the COUNT.

References:

  1. Bad habits - Count The Hard Way
  2. Fast count - Stackoverflow
  3. Columnstore Index Fast Count
Dan Co
  • 33
  • 5
  • Tried this, with a table of 145M rows, went instant. Thanks for the great query ! – Preben Huybrechts Sep 03 '20 at 07:13
  • This is just a dummy query, this table is used by sp with different requirements – Mahrukh Mehmood Sep 05 '20 at 07:58
  • i provided a couple of technique you can use. You did not mentioned any sp/other requirements on your original question. what do you mean by _"this table is used by sp with different requirementssp and what different requirements"_ ? – Dan Co Sep 06 '20 at 23:22
0

What happens when you do COUNT(Column), COUNT(1), COUNT(*) without the where clause:

SQL Server finds the smallest (containing the least amount of columns, the most rows in a single AK page) index that will do the job.

If you only have a clustered index, this means it will use the clustered index. (The clustered index contains all columns of the table)

What you can do to speed things up:

  1. Create a separate index with only 1 column, preferably a small data type. (The index creation can take lots of time).
  2. Create an indexed view for your query. Since you aren't grouping I wouldn't recommend this.
  3. Specify table hints WITH(TABLOCK) or WITH(TABLOCKX) to force a table lock or exclusive table lock.
Dale K
  • 25,246
  • 15
  • 42
  • 71
Preben Huybrechts
  • 5,853
  • 2
  • 27
  • 63