10

Possible Duplicate:
Performance of COUNT SQL function

Hi all, I've very large tables and I need to know number of records in each , My question is does it reduce the run time if I run :

select count(indexed column like my PK) from tbTest

instead of

select count(*) from tbTest
Community
  • 1
  • 1
Asha
  • 3,871
  • 7
  • 44
  • 57
  • 1
    to be honest I didn't find that question but I think My question is slightly different because it's about indexed column :D thanks – Asha Jul 01 '10 at 16:04

5 Answers5

8

see Performance of COUNT SQL function

The important thing to note is they are not equivalent

Community
  • 1
  • 1
Mike M.
  • 12,343
  • 1
  • 24
  • 28
  • I thought this had been asked before. ;) – NotMe Jul 01 '10 at 15:52
  • This question in the link refers to some_column_name but this question is for count(indexed_column) ... so it will be same – Baaju Jul 01 '10 at 15:57
  • 1
    @Baaju, there are indexed columns besides the PK. – Marcus Adams Jul 01 '10 at 16:09
  • If he is doing it on the PK sure, it's equivalent. If he's doing it on a different column (he only said 'LIKE my PK') then it's not equivalent. One method does not count NULL values. – Mike M. Jul 01 '10 at 18:01
2

Since the question is whether or not there is a performance difference, it would depend on the index. When you do COUNT(*), it will use the PK column(s) to determine the number of rows. If you do not have any indexes besides a clustered index on the PK column(s), it will scan the leaf nodes on the clustered index. That's probably a lot of pages. If you have a non clustered index that is skinnier than the clustered index, it will choose that instead, resulting in less reads.

So, if the column you select is contained in the smallest possible non-clustered index on the table, the SQL query optimizer will choose that for both count() (if you have a clustered ix that is the PK) and count(indexed_column). If you choose a count(indexed_col) that is only contained in a wide index, then the count() will be faster if your PK is a clustered index. The reason this works is that there is a pointer to the clustered index in all non-clustered indexes and SQL Server can figure out the number of rows based on that non-clustered index.

So, as usual in SQL Server, it depends. Do a showplan and compare the queries to each other.

Anon246
  • 1,821
  • 13
  • 16
  • 1
    In other words. The best you do is get the same speed with `count(indexed not null column) from tab` And possibly worse. (Unless stats are massively out of whack.) And with count(*) the query optimizer can change its plan in the event of new better indexes being added. – Shannon Severance Jul 01 '10 at 16:41
1

SELECT COUNT(*) may be faster. That is because using * gives the optimizer liberty to choose any column to count on. Say you have a primary key on a INT column, and a non clustered key on a different bigint column. But the primary key is likely the clustered index, and as such it is in fact significantly larger than the nonclustered bigint index (has more pages). So if the optimizer is free to choose the bigint non-clustered index, it can return the response faster. Possible much faster, depending on the table.

So overall is always better to leave it as COUNT(*) and let the optimizer choose.

Remus Rusanu
  • 288,378
  • 40
  • 442
  • 569
0

most likely, if the query scans the index instead of the whole table.

it is an easy thing to test, become your own scientist.

Rawheiser
  • 1,200
  • 8
  • 17
-1

Both are identical. If you look at the query execution plan for both, both will do an "index scan"

Baaju
  • 1,992
  • 2
  • 18
  • 22