I'm having trouble finding duplicates in a database table with the following setup:
======================================================================
| stock_id | product_id | store_id | stock_qty | updated_at          |
======================================================================
| 9990     | 51         | 1        | 13        | 2014-10-25 16:30:01 |
| 9991     | 90         | 2        | 5         | 2014-10-25 16:30:01 |
| 9992     | 161        | 1        | 3         | 2014-10-25 16:30:01 |
| 9993     | 254        | 1        | 18        | 2014-10-25 16:30:01 |
| 9994     | 284        | 2        | 12        | 2014-10-25 16:30:01 |
| 9995     | 51         | 1        | 11        | 2014-10-25 17:30:02 |
| 9996     | 90         | 2        | 5         | 2014-10-25 17:30:02 |
| 9997     | 161        | 1        | 3         | 2014-10-25 17:30:02 |
| 9998     | 254        | 1        | 16        | 2014-10-25 17:30:02 |
| 9999     | 284        | 2        | 12        | 2014-10-25 17:30:02 |
======================================================================
Stock updates are imported into this table every hour. I'm trying to find duplicate stock entries (any rows that share the same product id and store id) so I can delete the oldest. The query below is my attempt. By joining the table to itself on product id and store id, I can find one set of duplicates:
SELECT s.`stock_id`, s.`product_id`, s.`store_id`, s.`stock_qty`, s.`updated_at`
FROM `stock` s
INNER JOIN `stock` j ON s.`product_id`=j.`product_id` AND s.`store_id`=j.`store_id`
GROUP BY s.`stock_id`
HAVING COUNT(*) > 1
ORDER BY s.updated_at DESC, s.product_id ASC, s.store_id ASC, s.stock_id ASC;
While this query works, it doesn't find ALL duplicates, only one set. If an import goes awry and isn't noticed until the morning, we could be left with tons of duplicate stock entries. My MySQL skills are sadly lacking, and I'm at a complete loss as to how to find and delete all duplicates in a fast, reliable manner.
Any help or ideas are welcome. Thanks
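
One possible approach, sketched here rather than offered as the definitive fix: group on (product_id, store_id), keep the row with the highest stock_id in each group, and treat every other row in that group as stale. This rests on an assumption the question does not confirm, namely that stock_id increases with each import, so the highest id is the newest row. The sketch uses Python's sqlite3 module only as a harness so it can be run without a MySQL server; the two SQL strings are the part that matters and are written to avoid engine-specific syntax. The names FIND_DUPLICATES, DELETE_DUPLICATES, keep_id and newest are made up for illustration.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL so the sketch runs anywhere.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE stock ("
    " stock_id INTEGER PRIMARY KEY,"
    " product_id INTEGER, store_id INTEGER,"
    " stock_qty INTEGER, updated_at TEXT)"
)

# The ten sample rows from the question.
rows = [
    (9990, 51, 1, 13, "2014-10-25 16:30:01"),
    (9991, 90, 2, 5, "2014-10-25 16:30:01"),
    (9992, 161, 1, 3, "2014-10-25 16:30:01"),
    (9993, 254, 1, 18, "2014-10-25 16:30:01"),
    (9994, 284, 2, 12, "2014-10-25 16:30:01"),
    (9995, 51, 1, 11, "2014-10-25 17:30:02"),
    (9996, 90, 2, 5, "2014-10-25 17:30:02"),
    (9997, 161, 1, 3, "2014-10-25 17:30:02"),
    (9998, 254, 1, 16, "2014-10-25 17:30:02"),
    (9999, 284, 2, 12, "2014-10-25 17:30:02"),
]
conn.executemany("INSERT INTO stock VALUES (?, ?, ?, ?, ?)", rows)

# Every row that is NOT the newest for its (product_id, store_id) pair.
# Assumption: the highest stock_id in a pair is the newest row.
FIND_DUPLICATES = """
SELECT s.stock_id, s.product_id, s.store_id, s.stock_qty, s.updated_at
FROM stock s
JOIN (
    SELECT product_id, store_id, MAX(stock_id) AS keep_id
    FROM stock
    GROUP BY product_id, store_id
    HAVING COUNT(*) > 1
) k ON k.product_id = s.product_id AND k.store_id = s.store_id
WHERE s.stock_id < k.keep_id
ORDER BY s.product_id, s.store_id, s.stock_id
"""

# Delete everything except the newest row per pair. The inner SELECT is
# wrapped in a derived table because MySQL generally rejects a DELETE whose
# subquery reads the target table directly.
DELETE_DUPLICATES = """
DELETE FROM stock
WHERE stock_id NOT IN (
    SELECT keep_id FROM (
        SELECT MAX(stock_id) AS keep_id
        FROM stock
        GROUP BY product_id, store_id
    ) AS newest
)
"""

stale_ids = [row[0] for row in conn.execute(FIND_DUPLICATES)]
conn.execute(DELETE_DUPLICATES)
conn.commit()
remaining_ids = [
    row[0] for row in conn.execute("SELECT stock_id FROM stock ORDER BY stock_id")
]
print("stale:", stale_ids)
print("kept:", remaining_ids)
```

With the sample rows above, this reports stock_id 9990 through 9994 as stale and keeps 9995 through 9999. Because the grouping does not depend on how many copies exist, the same statements should also cope with several missed imports in a row. Running the SELECT first and checking its output before the DELETE is probably wise on real data, and an index on (product_id, store_id) would likely help on a large table.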