I am trying to find duplicates from a table based on multiple columns. I have a table which has a columns like Email(Office),Email(Personal1) & Email_Personal2, Mobile_Personal1, Mobile_Personal2, FirstName,MiddleName,LastName,CompanyName,Designation etc. It has millions of records. There are so many duplicates for a specific record.
The folowing image represents the sample Table:
Now, I want to find the record using sql query which has the full values in its columns, want to keep this record and delete remaining all.
select *,count(*) from mytable where first_name!="" group by First_Name,Email_Office,Email_Personal1,Email_Personal2,Personal_Mobile1,Personal_Mobile2 having count(*)>1
But its showing me a specific record with total number of occurrence in last count(*) column only. Guide me in above query how can I see a record which has all the details along with number of occurrences of that specific record? How do I keep that one complete record and delete remaining all from a table?
I've removed "having count(*) >1" to see each record from above query but its taking so much time to show the output,almost feel like its getting hanged.
select t.* from mytable t inner join (select first_name,middle_name,last_name,designation,company_name,email_office,email_personal1,email_personal2,personal_mobile1,personal_mobile2,count(*) as NoDuplicates from mytable group by first_name,middle_name,last_name,designation,company_name,email_office,email_personal1,email_personal2,personal_mobile1,personal_mobile2 having NoDuplicates > 1) tsum on t.first_name=tsum.first_name and t.Middle_Name=tsum.middle_name and t.Last_Name=tsum.last_name and t.Designation=tsum.designation and t.Company_Name=tsum.company_name and t.Email_Office=tsum.email_office and t.Email_Personal1=tsum.email_personal1 and t.Email_Personal2=tsum.email_personal2 and t.Personal_Mobile1=tsum.personal_mobile1 and t.Personal_Mobile2=tsum.personal_mobile2