1

In my table I have the following columns:

id|name|gender

I would like to extract all the names which are duplicated, but I don't want to scan all the db to count each record one by one...I just want to know who has the same name... Example:

0|mary|F
1|tom|M
2|peter|M
3|mary|F
4|chris|M
5|chris|F
6|alex|M

I just want to get: mary and chris... But I don't want to read all the records at once... any ideas? Thank you.

TheBoyan
  • 6,802
  • 3
  • 45
  • 61
Tattat
  • 15,548
  • 33
  • 87
  • 138

3 Answers3

7
SELECT
   name
FROM
   Mytable
GROUP BY
   name
HAVING
   COUNT(*) > 1

To separate gender too, change the GROUP BY to name, gender

To delete duplicate rows

DELETE MyTable
WHERE ID NOT IN
   (
    SELECT
       MIN(Id)
    FROM
       Mytable
    GROUP BY
       name
   )
gbn
  • 422,506
  • 82
  • 585
  • 676
2

Is this what you're looking for?

select name from tbl
group by name
having count(*) > 1
bniwredyc
  • 8,649
  • 1
  • 39
  • 52
1

As per my understanding what you mean from your statement that you do not want to read all the records at once is that you want the query for finding duplicate names in chunks.

If my understanding is correct, then you can run the query in batches using id as the separator for that ie. first run the query for 1000 records, then run the query for records between 1000 and 2000, then between 2000 and 3000.

But there is one issue when you do not want to read all the records at once : It might happen that the name is unique in the chunk you have picked up but still it can be a duplicate in the whole table i.e. a name can be unique between records 1000 and 2000 but can occur between records 4000 and 5000.

Deepansh Gupta
  • 593
  • 4
  • 9