3

Using SQL Server, I'm not a DBA but I can write some general SQL. Been pulling my hair out for about an hour now. Searching I've found several solutions but they all fail due to how GROUP BY works.

I have a table with two columns that I'm trying to check for duplicates:

  • userid
  • orderdate

I'm looking for rows that have BOTH userid and orderdate as duplicates. I want to display these rows.

If I use group by, I can't pull any other data, such as the order ID, because it's not in the group by clause.

Mureinik
  • 297,002
  • 52
  • 306
  • 350

4 Answers4

4

You could use the grouped query in a subquery:

SELECT *
FROM   mytable a
WHERE  EXISTS (SELECT   userid, orderdate
               FROM     mytable b
               WHERE    a.userid = b.userid AND a.orderdate = b.orderdate
               GROUP BY userid, orderdate
               HAVING   COUNT(*) > 1)
Mureinik
  • 297,002
  • 52
  • 306
  • 350
  • 1
    How would this `HAVING COUNT(*) = 0` give what OP wants? – SS_DBA Jul 25 '17 at 16:49
  • 2
    @WEI_DBA I literally don't know what I was thinking... Edited and fixed – Mureinik Jul 25 '17 at 16:58
  • Mureinik you're my hero. This works perfectly. I've been able to modify it to add the info I need and bingo. Thank you - seriously. –  Jul 25 '17 at 18:04
1

You can also use a windowed function:

; With CTE as 
    (Select *
    , count(*) over (partition by UserID, OrderDate) as DupRows
    from MyTable)

Select *
from CTE
where DupRows > 1
order by UserID, OrderDate
APH
  • 4,109
  • 1
  • 25
  • 36
0

You can get the duplicates by using the groupby and having. Like so:

SELECT
    userid,orderdate, COUNT(*)
FROM
    yourTable
GROUP BY
    userid,orderdate
HAVING 
    COUNT(*) > 1

EDIT:

SELECT * FROM yourTable
WHERE CONCAT(userid,orderdate) IN
(
    SELECT
        CONCAT(userid,orderdate)
    FROM
        yourTable
    GROUP BY
        userid,orderdate
    HAVING 
        COUNT(*) > 1
)
Icculus018
  • 1,018
  • 11
  • 19
  • 1
    Yes, this will give you the duplicates, but they also want all of the columns in the result set. Good start though. – SS_DBA Jul 25 '17 at 17:04
  • Yeah thanks for this :) This is where I started but couldn't get the rest of the columns as @WEI_DBA mentioned –  Jul 25 '17 at 19:01
0
SELECT * 
FROM myTable 
WHERE CAST(userid as Varchar) + '/' + CONVERT(varchar(10),orderdate,103) In 
(
    SELECT 
        CAST(userid as Varchar) + '/' + CONVERT(varchar(10),orderdate,103)
    FROM myTable
    GROUP BY userid , orderdate
    HAVING COUNT(*) > 1
);
Jonathan
  • 6,507
  • 5
  • 37
  • 47
  • While technically, this will get you the correct answer, the concatenation will kill performance if the data set is large. You would want to evaluate the two fields independently. – Eli Jul 25 '17 at 17:22
  • Thanks for your help ! –  Aug 09 '17 at 21:49