0

So I'm tasked with cleaning up a system that has generated redundant orders.

Data example of the problem

ORDER ID, SERIAL, ...
1         1
2         1
3         2
4         2
5         3
6         3
7         3

The above data shows that 2 orders were generated with serial 1, 2 orders with serial 2, and 3 orders with serial 3. This is not allowed, and there should be only one order per serial.

So I need a query that can identify the REDUNDANT orders ONLY. I'd like the query to exclude the original order.

So the output from the above data should be:

REDUNDANT ORDER IDS
2
4
6
7

I can easily identify which orders have duplicates using GROUP BY and HAVING COUNT(*) > 1 but the tricky part comes with removing the original.

Is it even possible?

Any help is greatly appreciated.

DarkNeuron
  • 7,981
  • 2
  • 43
  • 48
  • As Thomas hints at, not really sure how I can use the linked question; it seems somewhat different that my needs. – DarkNeuron May 11 '17 at 10:19
  • @ThomasG Your mention of MIN gave me an idea, and I think I'll be able to solve it using a self join. I can only give you an upvote, so I'll do that. – DarkNeuron May 11 '17 at 10:28
  • @ThomasG At it's most basic, it's hardly a stretch: `DELETE FROM my_table WHERE id NOT IN (...)` – Strawberry May 11 '17 at 10:45
  • 1
    @DarkNeuron your answer is here : http://sqlfiddle.com/#!9/9a918/3 – Thomas G May 11 '17 at 10:56

1 Answers1

1

As posted in the comments, here's one way to achieve this:

SELECT T1.ORDER_ID as redundant
FROM thetable T1
LEFT JOIN
(
  SELECT SERIAL, MIN(ORDER_ID) AS firstorder
  FROM thetable
  GROUP BY SERIAL
  HAVING COUNT(*) > 1
) T2 ON T1.ORDER_ID=T2.firstorder
WHERE T2.firstorder IS NULL

SQL Fiddle

Thomas G
  • 9,886
  • 7
  • 28
  • 41