15

I'm running a query like this in MSSQL2008:

select count(*)
from t1
inner join t2 on t1.id = t2.t1_id
inner join t3 on t1.id = t3.t1_id 

Assume t1.id has a NOT NULL constraint. Since they're inner joins and t1.id can never be null, using count(t1.id) instead of count(*) should produce the exact same end result. My question is: Would the performance be the same?

I'm also wondering whether the joins could affect this. I realize that adding or removing a join will affect both performance and the length of the result set. Suppose that without changing the join pattern, you set count to target only one table. Would it make any difference? In other words, is there a difference between these two queries:

select count(*)    from t1 inner join t2 on t1.id = t2.t1_id 
select count(t1.*) from t1 inner join t2 on t1.id = t2.t1_id 

COUNT(id) vs. COUNT(*) in MySQL answers this question for MySQL, but I couldn't find answers for MS-SQL specifically, and I can't find anything at all that takes the join factor into account.

NOTE: I tried to find this information on both Google and SO, but it was difficult to figure out how to word my search.

Community
  • 1
  • 1
Justin Morgan - On strike
  • 30,035
  • 12
  • 80
  • 104
  • Possible Duplicates: http://stackoverflow.com/questions/1221559/count-vs-count1 http://stackoverflow.com/questions/2710621/count-vs-count1-vs-countpk-which-is-better – Michael Fredrickson Jun 29 '11 at 17:16

2 Answers2

17

I tried a few SELECT COUNT(*) FROM MyTable vs. SELECT COUNT(SomeColumn) FROM MyTable with various sizes of tables, and where the SomeColumn once is a clustering key column, once it's in a non-clustered index, and once it's in no index at all.

In all cases, with all sizes of tables (from 300'000 rows to 170 million rows), I never see any difference in terms of either speed nor execution plan - in all cases, the COUNT is handled by doing a clustered index scan --> i.e. scanning the whole table, basically. If there is a non-clustered index involved, then the scan is on that index - even when doing a SELECT COUNT(*)!

There doesn't seem to be any difference in terms of speed or approach how those things are counted - to count them all, SQL Server just needs to scan the whole table - period.

Tests were done on SQL Server 2008 R2 Developer Edition

marc_s
  • 732,580
  • 175
  • 1,330
  • 1,459
-1

select count(*) will be slower as it attempts to fetch everything. Specifying a column (PK or any other indexed column) will speed up things as the query engine knows ahead of time what it is looking for. It'll also use an index as opposed to going against the table.

Mrchief
  • 75,126
  • 20
  • 142
  • 189
  • 2
    I think this was true a long time ago, but previous discussions on this topic show that this is no longer the case. Database engines are smart enough to know that count(*) should only check for the existence of a row, and will not fetch all the values for the row. http://stackoverflow.com/questions/1221559/count-vs-count1 – Michael Fredrickson Jun 29 '11 at 17:14
  • 1
    @Micheal: I don't think this is universally true. For example, last time I checked, DB2 on zOS v9 still fetched everything. That was quite some time back. Unless one knows the exact working of the DB he's dealing with, I guess its safe to avoid count(*). – Mrchief Jun 29 '11 at 17:27
  • Ok, I stand corrected - that sounds reasonable that it's not universally true... but OP specified SQL Server 2008. – Michael Fredrickson Jun 29 '11 at 17:34