I have a table called emaildata consisting of four columns: emailaddress, domainname, data and id.
The emailaddress column should contain only unique entries, but there are many duplicates. The domainname and data columns are not unique, so they will contain duplicates, which is fine. The id column is set to auto-increment, so it contains only unique values.
My question is how do I get rid of all rows that feature duplicate email addresses, keeping the one with the lowest id?
There should be around 370,000 rows, but currently I've got 906,000.
I had an SQL statement that I used on a similar table before, and I've tried to adapt it to this one without success.
DELETE T1
FROM emaildata T1, emaildata T2
WHERE T1.emailaddress = T2.emailaddress
  AND T1.id > T2.id;
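For what it's worth, the keep-the-lowest-id logic itself checks out on a tiny dataset. Here's a minimal sketch using Python's sqlite3 with made-up data (SQLite doesn't support MySQL's multi-table DELETE, so it uses the equivalent NOT IN (SELECT MIN(id) ...) form; table and column names mirror the ones above):

```python
import sqlite3

# Toy reproduction: delete every row whose emailaddress also
# appears on a row with a lower id, keeping the lowest-id copy.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE emaildata (
        id INTEGER PRIMARY KEY AUTOINCREMENT,
        emailaddress TEXT,
        domainname TEXT,
        data TEXT
    )
""")
rows = [
    ("a@example.com", "example.com", "x"),
    ("b@example.com", "example.com", "y"),
    ("a@example.com", "example.com", "z"),  # duplicate of id 1
]
conn.executemany(
    "INSERT INTO emaildata (emailaddress, domainname, data) VALUES (?, ?, ?)",
    rows,
)

# SQLite equivalent of the MySQL self-join delete: keep only the
# lowest id per emailaddress.
conn.execute("""
    DELETE FROM emaildata
    WHERE id NOT IN (
        SELECT MIN(id) FROM emaildata GROUP BY emailaddress
    )
""")

kept = conn.execute(
    "SELECT id, emailaddress FROM emaildata ORDER BY id"
).fetchall()
print(kept)  # the duplicate row (id 3) is gone; ids 1 and 2 remain
```

So the logic is fine on small data; the trouble only shows up at the real table's size.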
The above was based on the following, which I ran against another table and which worked fine.
DELETE T1
FROM email_list_subscribers T1, email_list_subscribers T2
WHERE T1.emailaddress = T2.emailaddress
  AND T1.subscriberid > T2.subscriberid;
I've tried running this against my table on the remote server in phpMyAdmin; after pressing the Go button, the loading bar appears in the middle, then disappears as if it is processing, but it never finishes.
I've tried repeating this against the same table running on my home server (XAMPP), via phpMyAdmin and again with HeidiSQL: phpMyAdmin shows the same problem, and HeidiSQL appears to crash.
I've tried other solutions that I've seen on here, but I keep hitting the same "timeout"/crash problem. I never had issues with the original statement running on the remote server, though granted that was against a database a third of the size.
Any info would be appreciated.