0

I have a big table. I am finding duplicates, this table has 10 columns(lets say A,B,C,D..etc). I am running below query to get duplicate record but its taking more than 1 hour. Query :

SELECT COUNT(*) FROM TABLE1 GROUP BY A,B,C,D,E HAVING COUNT(*) >1

Can someone suggest me faster way to get this done.

Ankit Bajpai
  • 13,128
  • 4
  • 25
  • 40
Deepak nigam
  • 99
  • 1
  • 15
  • You need all column which is having different values? – Unknown_Coder Sep 20 '17 at 10:02
  • how many rows in the table exactly? And what hardware? Lots of things can impact on performance, not just the query. Anyway, if this is a one-off operation, does it matter if it takes a little while? However i#f you're doing this regularly to remove duplicates, ask yourself why - shouldn't you be changing the primary key on the table so the duplicates can't be created in the first place? – ADyson Sep 20 '17 at 10:06
  • If you just trying to remove the duplicate rows after finding them, Refer this - https://stackoverflow.com/questions/3311903/remove-duplicate-rows-in-mysql – Ankit Bajpai Sep 20 '17 at 10:06
  • My requirement is get if column A,B,C,D,E can make a unique index on this table or not. This table has already a number of records. So, I am finding duplicates in existing records. – Deepak nigam Sep 20 '17 at 10:08
  • yes you've already said that, but that doesn't answer my question about whether this is a one-off job or something you want to repeat regularly. If it's just to be run once then maybe speed doesn't matter so much? – ADyson Sep 20 '17 at 12:07

1 Answers1

0

Check Performance by applying this query!

SELECT count(*) FROM tbl_name mt
JOIN
(SELECT A,B,C,D,E FROM tbl_name GROUP BY A,B,C,D,E HAVING COUNT(*) >1) T 
ON mt.A= T.A
and mt.B= T.B
and mt.C= T.C
and mt.D= T.D
and mt.E= T.E
Unknown_Coder
  • 764
  • 6
  • 24