4

I would be grateful if you could help me to solve this sql problem.

I have a table that contains two columns. Name them UserID and LocationID in this example.

I would like to clean this table, so I would like to keep the first occurences of each unique pairs of UserIDs and LocationIDs and delete additional rows that contains the same pairs. In other words I would like to know every location for each user where he has been to. If he visited the same place again and again and it was logged, I would like to delete those records. For example, this is the original table:

ID UserID  LocationID
1. "user1" "location1"
2. "user1" "location2"
3. "user2" "location3"
4. "user1" "location1"
5. "user2" "location3"
6. "user2" "location3"

I would like to remove the 4., 5. and 6. rows.

Is it possible to solve it in one mysql command? Thanks.

  • How do you know what is "first occurrence" just because it's first in your output doesn't necessarily mean it was the first entered into the database... 2nd where are the numbers coming from? are they in the table too with some sort of column name or there for illustration purposes only? – xQbert Feb 21 '13 at 20:00
  • Does your table have an `AUTO_INCREMENT` field, like a `rowID` or something? – gen_Eric Feb 21 '13 at 20:01
  • Yes, the numbers are the ids of records. Auto-incremented INT. By "first" occurence I mean the lowest id. –  Feb 21 '13 at 20:03
  • after removing them, define [UserID,LocationID] as primary key to avoid same issue in the future. – Aris Feb 21 '13 at 20:05
  • 1
    @Aris: Or rather a `UNIQUE KEY`, the `PRIMARY KEY` should be the `ID` field. – gen_Eric Feb 21 '13 at 20:06
  • Create a `UNIQUE KEY` on `UserID` – Kermit Feb 21 '13 at 20:07

2 Answers2

5

Delete using a self-join:

DELETE t2.*
FROM tablename AS t1
INNER JOIN tablename AS t2
    ON t1.UserID = t2.UserID
    AND t1.LocationID = t2.LocationID
WHERE t1.ID < t2.ID

tablename should be replace with the name of your table.

Oswald
  • 31,254
  • 3
  • 43
  • 68
1

To add on to Oswald's answer, to prevent duplicates in the future you can use INSERT...ON DUPLICATE UPDATE:

INSERT INTO table (a,b,c) VALUES (1,2,3)
ON DUPLICATE KEY UPDATE c=c+1;

UPDATE table SET c=c+1 WHERE a=1;
user3871
  • 12,432
  • 33
  • 128
  • 268