-1

I have a table which has student ID and student email address, and I want to delete duplicate email address, I am using MySQL and wondering any good solutions. For any duplicate email address, keep either one is fine.

Sample input,

ID1 s1@example.com
ID2 s2@example.com
ID3 s1@example.com

Sample output,

ID1 s1@example.com
ID2 s2@example.com

thanks in advance, Lin

Lin Ma
  • 9,739
  • 32
  • 105
  • 175
  • 1
    you should use validation when inserting email in database... if the email address already exits throw error that email already exists, – Rahman Qaiser Sep 04 '15 at 06:39
  • @RahmanQaiser, thanks for the advice, and duplicate already there. Any advice is appreciated. :) – Lin Ma Sep 04 '15 at 06:41
  • 1
    @LinMa are those ids are integer ? – Abhik Chakraborty Sep 04 '15 at 06:42
  • 1
    try SELECT id, table.email FROM table INNER JOIN (SELECT email FROM table GROUP BY email HAVING count(email) > 1) dup ON table.email = dup.email – Rahman Qaiser Sep 04 '15 at 06:42
  • @AbhikChakraborty, yes, ID is integer. Any advice is appreciated. :) – Lin Ma Sep 04 '15 at 06:43
  • @RahmanQaiser, thanks for the solution, and I know how to select unique emails and just not sure how to delete duplicate. – Lin Ma Sep 04 '15 at 06:44
  • 3
    [http://stackoverflow.com/questions/5935797/delete-duplicate-email-addresses-from-table-in-mysql](Delete Duplicate email addresses from Table in MYSQL) – Rahman Qaiser Sep 04 '15 at 06:44
  • @RahmanQaiser, thanks for the reference, the sample you referred to delete all records which has duplicate email, but I want to keep one. :) – Lin Ma Sep 04 '15 at 06:46
  • @JakubKania, your solution works for me and could you write an answer so that I can mark it as answered to benefit future people? Thanks. – Lin Ma Sep 04 '15 at 17:55

1 Answers1

3

You can delete the data using the join and keep the new or old id as per your wish and here how its done

delete t1 from table_name t1
join table_name t2 on t1.email = t2.email and t1.id > t2.id

This will keep the older id for the duplicate value http://sqlfiddle.com/#!9/9d2925/1

If you want the new id then change t1.id > t2.id to t1.id < t2.id

Abhik Chakraborty
  • 44,654
  • 6
  • 52
  • 63
  • just wondering how SQL engine executed underlying for the query. Not sure for all join group, inside each group if SQL engine enumerate all pairs and remove smaller id in a pair wise way, if so, the complexity will be large. – Lin Ma Sep 04 '15 at 18:21