4

I want to make a smart count operation, so that if the data in columns are the same, then it will be counted as 1.

My table is:

dbo.Messages 
(
    FromUserId INT,
    ToUserId INT
)

Data:

INSERT dbo.Messages VALUES(1, 5), (2, 20), (5, 1), (1, 5);

The count should return 2 because (1,5) and (5,1) is the same in my algorithm.

How can I write it in SQL Server TSQL?

Thanks in advance.

Misha Zaslavsky
  • 8,414
  • 11
  • 70
  • 116

4 Answers4

2

One way to go about this is to group by the least and greatest of the FromUserId and ToUserId, using the distinct values from your original table. Since SQL Server, unlike MySQL, does not have a LEAST and GREATEST function, we can use CASE expressions instead.

SELECT CASE WHEN t.FromUserId < t.ToUserId THEN t.FromUserId ELSE t.ToUserId END,
       CASE WHEN t.FromUserId < t.ToUserId THEN t.ToUserId   ELSE t.FromUserId END,
       COUNT(*) AS duplicateCount
FROM
(
    SELECT DISTINCT FromUserId, ToUserId
    FROM dbo.Messages
) t
GROUP BY CASE WHEN t.FromUserId < t.ToUserId THEN t.FromUserId ELSE t.ToUserId END,
         CASE WHEN t.FromUserId < t.ToUserId THEN t.ToUserId   ELSE t.FromUserId END
Tim Biegeleisen
  • 502,043
  • 27
  • 286
  • 360
2

This works quite well:

CREATE TABLE #Messages 
(
    FromUserId INT,
    ToUserId INT
);

INSERT #Messages VALUES(1, 5), (2, 20), (5, 1), (1, 5);

SELECT COUNT(*)
FROM (
  SELECT M1.FromUserId, M1.ToUserId
  FROM #Messages AS M1
  EXCEPT
  SELECT M2.ToUserId, M2.FromUserId
  FROM #Messages AS M2
  WHERE M2.ToUserId > M2.FromUserId
  ) AS T;

Derived table with EXCEPT will remove your duplicates and then it just counts so called unique values. Keep in mind that here's no need for DISTINCT keyword, EXCEPT removes all dupes.

Results from derived table:

FromUserId ToUserId 
---------- -------- 
1          5        
2          20   

You can check how this query works here: https://data.stackexchange.com/stackoverflow/query/524634/counting-unique-values

Evaldas Buinauskas
  • 13,739
  • 11
  • 55
  • 107
1

On SQL Server 2008 and later this should work:

SELECT distinct
    (SELECT Min(v) FROM (VALUES (FromUserId), (ToUserId)) AS value(v)) as UserIdMin,
    (SELECT Max(v) FROM (VALUES (FromUserId), (ToUserId)) AS value(v)) as UserIdMax
FROM dbo.Messages

Cred to: SQL MAX of multiple columns?

Community
  • 1
  • 1
Daniel Stackenland
  • 3,149
  • 1
  • 19
  • 22
  • 1
    This is great answer. To add something up, you don't need two seperate subqueries and could use `CROSS APPLY` to return both `MIN` and `MAX` values. Here's an example: https://data.stackexchange.com/stackoverflow/query/524750/counting-unique-values – Evaldas Buinauskas Aug 15 '16 at 14:29
0

Demo here

select distinct  t1.*
from
#temp t1
join
#temp t2
on t1.FromUserId=t2.ToUserId
and t1.ToUserId=t2.FromUserId
Community
  • 1
  • 1
TheGameiswar
  • 27,855
  • 8
  • 56
  • 94