Reduce mysql database to elminiate duplicates

Question

I have a table called "lane" with the following properties.

CREATE TABLE `lane` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `origRegion` varchar(45) NOT NULL,
  `origState` char(2) NOT NULL,
  `destRegion` varchar(45) NOT NULL,
  `destState` char(2) NOT NULL
  PRIMARY KEY (`id`)
)

There are duplicate rows in this table of the following columns: origState, origRegion, destState, destRegion. I'd like to be able to select all rows JOINED to what the min(id) is of the first occurance.

For example, with data:

1 ALL MA ALL OH
2 ALL MA ALL OH
3 ALL MA ALL OH

and a SQL similar to this (which misses all the duplicate rows):

select l.*, l2.count, l2.minId from tmpLane l 
JOIN (SELECT id, min(ID) as minId from tmpLane 
  GROUP BY origRegion, origState, destRegion, destState) l2 on l.id = l2.id;

Result (note the count and minId at the end):

1 ALL MA ALL OH 3 1
2 ALL MA ALL OH 3 1
3 ALL MA ALL OH 3 1

Note, that the query used above is an adaptation of the solution here (which doesn't work in this situation)

Is the last result you show what you want to get or is it the wrong result? Please show what is the expected result — Roberto, Dec 21 '15 at 14:15
The result shown is what I expect. What I'm actually getting is just 1 row. A delete query won't work. — Domenic D., Dec 21 '15 at 15:28

score 1 · Answer 1 · answered Dec 21 '15 at 14:24

1

You can run this query to remove all the duplicate rows from your database:-

ALTER IGNORE TABLE `lane`
ADD UNIQUE INDEX (`origRegion`, `origState`, `destRegion`, `destState`);

This will add unique index to your table and remove all dulicate rows and will make sure that no duplicate rows being inserted in future.

answered Dec 21 '15 at 14:24

Ankit Bajpai

13,128
4
25
40

1

The issue is that the OP wants the COUNT of the duplicate rows displayed for each row, not just to remove or ignore them completely. – DiscipleMichael Dec 21 '15 at 14:44
The count is helpful just to see how many there are. I do not want to delete the rows. I only want to select which have duplicates and propose a single instance (min) that could be used. – Domenic D. Dec 21 '15 at 15:32

score 1 · Accepted Answer · edited Dec 21 '20 at 17:40

1

SELECT ID, 
       origRegion, 
       origState, 
       destRegion, 
       destState,
       (SELECT COUNT(*)
        FROM Lane l3
        WHERE l.origRegion = l3.origRegion
        and l.origState = l3.origState
        and l.destRegion = l3.destRegion
        and l.destState = l3.destState) as 'Count', 
       (SELEcT MIN(ID) 
        FROM Lane l2 
        WHERE l.origRegion = l2.origRegion
        and l.origState = l2.origState
        and l.destRegion = l2.destRegion
        and l.destState = l2.destState) as minID
FROM lane l

edited Dec 21 '20 at 17:40

TylerH

20,799
66
75
101

answered Dec 21 '15 at 15:14

DiscipleMichael

510
2
17

I don't JUST want the first occurrence, I want all occurences with the added column providing a single unique instance (min(id)). – Domenic D. Dec 21 '15 at 15:31

Reduce mysql database to elminiate duplicates

2 Answers2