22

I have a table, VehicleModelYear, containing columns id, year, make, and model.

The following two queries work as expected:

SELECT DISTINCT make, model
FROM VehicleModelYear

SELECT COUNT(DISTINCT make)
FROM VehicleModelYear

However, this query doesn't work

SELECT COUNT(DISTINCT make, model)
FROM VehicleModelYear

It's clear the answer is the number of results returned by the first query, but just wondering what is wrong with this syntax or why it doesn't work.

csab
  • 582
  • 1
  • 8
  • 14
  • 1
    possible duplicate of [Counting DISTINCT over multiple columns](http://stackoverflow.com/questions/1471250/counting-distinct-over-multiple-columns) – Tim Schmelter Sep 23 '13 at 21:14

3 Answers3

45

COUNT() in SQL Server accepts the following syntax

COUNT(*)
COUNT(colName)
COUNT(DISTINCT colName)

You can have a subquery which returns unique set of make and model that you can count with.

SELECT  COUNT(*)
FROM
        (
            SELECT  DISTINCT make, model
            FROM    VehicleModelYear
        ) a

The "a" at the end is not a typo. It's an alias without which SQL will give an error ERROR 1248 (42000): Every derived table must have its own alias.

Nav
  • 19,885
  • 27
  • 92
  • 135
John Woo
  • 258,903
  • 69
  • 498
  • 492
19

Try combining them into a single field:

SELECT COUNT(DISTINCT make + ' ' + model)
FROM VehicleModelYear
Yuriy Galanter
  • 38,833
  • 15
  • 69
  • 136
  • 10
    or `SELECT COUNT(DISTINCT CHECKSUM(make, model)) FROM VehicleModelYear` – Tim Schmelter Sep 23 '13 at 21:21
  • 1
    When I can find a scalar expression that maps back to columns, when the columns are strings or easily castable, and when there are few columns in the expression, I use this. It's quick and dirty. – Iain Samuel McLean Elder Mar 19 '14 at 23:55
  • If not referring to columns which are strings, for example int columns - this answer doesn't result in the correct value but won't generate an error when being executed. As @Ian Elder's comment states,the columns could be casted in order to get this to work correctly. Anyone have some info on why this doesn't fail (for int columns) and what's going on? – Davie Brown Dec 15 '15 at 10:23
  • 2
    Checksum is better as you do not have to worry about casting. – bfs Apr 26 '16 at 15:43
  • Be sure to run a performance comparison of the checksum solution vs just wrapping the multi column distinct into another sub query and doing a count on that. – m12lrpv Feb 13 '18 at 01:00
6

The syntax is valid SQL but has not been implemented in SQL-Server.

Try a rewrite:

; WITH cte AS
  ( SELECT DISTINCT make, model
    FROM VehicleModelYear
  )
SELECT COUNT(*) AS total
FROM cte ; 

or:

; WITH cte AS
  ( SELECT COUNT(DISTINCT model) AS cnt
    FROM VehicleModelYear
    GROUP BY make
  )
SELECT SUM(cnt) AS total
FROM cte ; 

or:

; WITH cte AS
  ( SELECT NULL AS n
    FROM VehicleModelYear
    GROUP BY make, model
  )
SELECT COUNT(*) AS total
FROM cte ; 

And finally, the 2nd and 3rd query above modified, without subqueries:

SELECT DISTINCT
    SUM(COUNT(DISTINCT model)) OVER () AS total
FROM VehicleModelYear
GROUP BY make ;

SELECT DISTINCT COUNT(*) OVER () AS total
FROM VehicleModelYear
GROUP BY make, model ;

Late additions by @Alexander Fedorenko:

SELECT TOP (1)
    SUM(COUNT(DISTINCT model)) OVER () AS total
FROM VehicleModelYear
GROUP BY make ;

SELECT TOP (1) COUNT(*) OVER () AS total
FROM VehicleModelYear
GROUP BY make, model ;

or:

; WITH cte AS
  ( SELECT DENSE_RANK() OVER(ORDER BY make, model) AS dr
    FROM VehicleModelYear
  )
SELECT MAX(dr) AS total
FROM cte ;
Community
  • 1
  • 1
ypercubeᵀᴹ
  • 113,259
  • 19
  • 174
  • 235