4

In Oracle, it's possible to get a count of distinct values in multiple columns by using the || operator (according to this forum post, anyway):

SELECT COUNT(DISTINCT ColumnA || ColumnB) FROM MyTable

Is there a way to do this in SQL Server 2008? I'm trying to perform a single query to return some group statistics, but I can't seem to do it.

For example, here is a table of values I'm trying to query:

AssetId MyId    TheirId   InStock
328     10      10        1
328     20      20        0
328     30      30        0
328     40      10        0
328     10      10        0
328     10      10        0
328     10      10        0
328     10      10        0

For AssetId #328, I want to compute the total number of unique IDs in the MyId and TheirId columns (4 = 10, 20, 30, 40), as well as the total number of non-zero rows in the InStock column (1):

AssetId     TotalIds    AvailableIds
328         4           1

Is there a way to work this magic somehow?

Mass Dot Net
  • 2,150
  • 9
  • 38
  • 50
  • 1
    This question on SO may help: [counting distinct over multiple columns](http://stackoverflow.com/questions/1471250/counting-distinct-over-multiple-columns) – Alex P Aug 12 '12 at 07:20

6 Answers6

2

You can use a cross apply and values.

select T1.AssetId,
       count(distinct T2.ID) TotalIds,
       sum(case T2.InStock when 0 then 0 else 1 end) AvailableIds 
from YourTable as T1
  cross apply(values(T1.MyId, T1.InStock),
                    (T1.TheirId, 0)
             ) as T2(ID, InStock)
group by T1.AssetId  

SE-Data

Or you can do a union all in a sub query.

select T.AssetId,
       count(distinct T.ID) TotalIds,
       sum(case T.InStock when 0 then 0 else 1 end) AvailableIds 
from (
     select AssetId, MyId as ID, InStock
     from YourTable
     union all
     select AssetID, TheirId, 0
     from YourTable
     ) as T
group by T.AssetId  
Community
  • 1
  • 1
Mikael Eriksson
  • 136,425
  • 22
  • 210
  • 281
  • 1
    Wow. +1 on coolness but the cross apply could be a performance killer. – Lieven Keersmaekers Aug 12 '12 at 10:01
  • @Lieven It does not look so on the execution plan. It will only do one table scan and the cross apply does a constant scan. But the solutions provided should of course be tested with a realistic amount of data. It might be a different story with a lot of rows. – Mikael Eriksson Aug 12 '12 at 10:04
  • Sorry for the redonk delay in marking your answer as accepted, and thank you for the workable solution. I was hoping for an elegant native solution, but in the absence of that I'll settle for a working one... cheers! – Mass Dot Net May 02 '13 at 02:08
2

You can follow the Oracle example and concatenate the values together (that is what the Oracle query is doing). You just have to convert the values to characters first:

select AssetId,
       count(distinct cast(MyId as varchar(8000))+','+cast(TheirId as varchar(8000)
            ) totalIds,
       count(distinct case when inStock> 0
                           then cast(MyId as varchar(8000))+','+cast(TheirId as varchar(8000)
             end) as AvailableIds
from t
group by Assetid

You can also do it as a subquery:

select AssetId, count(*) as TotalIds,
       sum(case when inStock > 0 then 1 else 0 end) as AvailableIds
from (select AssetId, myId, theirId, max(inStock) as inStock
      from t
      group by AssetId, myId, theirId
     ) a
group by AssetId

"Theoretically", I like the second approach better, since it is more set-based. However, if you find yourself trying to count distinct combinations columns in several different variables, the string concatenation approach is more practical.

Gordon Linoff
  • 1,242,037
  • 58
  • 646
  • 786
1

You can get the result like this:

DECLARE @t TABLE (AssetId INT, MyId    INT, TheirId   INT, InStock INT)
INSERT @t
VALUES 
(328,10, 10,   1)
,(328,20, 20,   0)
,(328,30, 30,   0)
,(328,40, 10,   0)
,(328,10, 10,   0)
,(328,10, 10,   0)
,(328,10, 10,   0)
,(328,10, 10,   0)

;WITH a AS(
    SELECT  AssetId,
            COUNT(col) cnt
    FROM 
    (
            SELECT  MyId col, AssetId
            FROM    @t
            UNION
            SELECT  TheirId col, AssetId
            FROM    @t
    ) b
    GROUP BY AssetId
)

SELECT  a.AssetId,
        a.cnt TotalIds,
        SUM(CASE WHEN InStock <> 0 THEN 1 ELSE 0 END) AvailableIds
FROM    @t c
JOIN    a ON a.AssetId = c.AssetId
GROUP   BY a.AssetId, a.cnt

In Common Table Expression (WITH code block) 'uniqueness' is guaranteed by using UNION operator which discards duplicate values, that's why COUNT(col) doesn't need to be used like COUNT(DISTINCT col).

Ivan Golović
  • 8,732
  • 3
  • 25
  • 31
  • 1
    Very similar to answer I was going to post! +1 – XN16 Aug 12 '12 at 08:27
  • I was hoping to avoid using CTEs, since it kills the performance of my query, increasing it by over 100% when processing large datasets. This is the solution I already have, but I was hoping there was some clever way of doing it, like with Oracle's "||" operator. – Mass Dot Net Aug 12 '12 at 09:08
1

I think it's good solution for you

SELECT COUNT(*)
FROM (SELECT DISTINCT Column1, Column2 
      FROM MyTable) A
David Weinberg
  • 1,033
  • 1
  • 13
  • 29
0

Option #1

select 
          AssetID, count(distinct MyId) As MyId, SUM(InStock) InStock
From T
Group By 
      AssetID

Option #2 Without CTE

Select AssetID, count(MyId), sum(InStock) InStock From
(
    select 
            AssetID, MyId, SUM(InStock) InStock
    From MyTable
    Group By 
            AssetID, MyId
)K
Group by AssetID

Option #3 With CTE

;With Sub(AssetID, MyId, InStock)
As
(
    select 
            AssetID, MyId, SUM(InStock) InStock
    From MyTable
    Group By 
            AssetID, MyId
)

Select AssetID, count(MyId), sum(InStock) From
(
    Select * from Sub
)K
Pankaj
  • 9,749
  • 32
  • 139
  • 283
0

If you don't like to use CTE's, you can try using following solution. The gist of it is to

  • select the TotalID's for each AssetID in a seperate subquery
  • select the AvailableIDs for each AssetID in a seperate subquery
  • JOIN the results of both subqueries to produce the final results.

The statement as is works on the entire table. You can get the results for a single AssetID by adding an appropriate where clause to the entire group.

SQL Statement

SELECT  a.AssetId, t.TotalIDs, a.AvailableIDs
FROM    (
            SELECT  AssetID, TotalIDs = COUNT(*) 
            FROM    (
                        SELECT  AssetID 
                        FROM    MyTable 
                        GROUP BY 
                                MyId, TheirID, AssetID
                    ) t 
            GROUP BY 
                    AssetID
        ) AS t
        INNER JOIN (
            SELECT  AssetID, AvailableIDs = SUM(InStock) 
            FROM    MyTable 
            GROUP BY 
                    AssetID
        ) AS a ON a.AssetId = t.AssetId

Test script

;WITH MyTable (AssetId, MyId, TheirId, InStock) AS (
    SELECT * FROM (VALUES 
        (328, 10, 10, 1)
        , (328, 20, 20, 0)
        , (328, 30, 30, 0)
        , (328, 40, 10, 0)
        , (328, 10, 10, 0)
        , (328, 10, 10, 0)
        , (328, 10, 10, 0)
        , (328, 10, 10, 0)
        , (329, 10, 10, 0)
        , (329, 10, 20, 1)
    ) AS a (b, c, d, e)
)
SELECT  a.AssetId, t.TotalIDs, a.AvailableIDs
FROM    (
            SELECT  AssetID, TotalIDs = COUNT(*) 
            FROM    (
                        SELECT  AssetID 
                        FROM    MyTable 
                        GROUP BY 
                                MyId, TheirID, AssetID
                    ) t 
            GROUP BY 
                    AssetID
        ) AS t
        INNER JOIN (
            SELECT  AssetID, AvailableIDs = SUM(InStock) 
            FROM    MyTable 
            GROUP BY 
                    AssetID
        ) AS a ON a.AssetId = t.AssetId
Lieven Keersmaekers
  • 57,207
  • 13
  • 112
  • 146