I have a table like this:

name
Smith
Smith
Perez
Pérez

I would like to eliminate exact duplicates such as Smith but keep both Perez and Pérez (e and é) as separate values. If I use GROUP BY, I get two rows (Smith and one of Perez/Pérez), but I want three rows (Smith, Perez, Pérez). The same happens with Sjögren and Sjogren, and so on. Thanks.

FP Towers

2 Answers

1) First, check whether your table uses the utf8 character set:

select table_name, table_collation
from information_schema.tables
where table_schema = 'your_database';

2) If it does not (otherwise skip to step 3), alter your table to use the utf8 character set so that it supports special characters:

ALTER TABLE `your_table` CONVERT TO CHARACTER SET utf8;

3) Select from your table with an explicit utf8 collation:

select * from your_table group by name collate utf8_general_ci
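
Character set and collation are ultimately set per column, and a column can differ from its table's default. As a rough sketch (using the same placeholder names `your_database` and `your_table` as above), the column-level settings can be inspected like this:

```sql
-- Show the character set and collation of each text column.
-- 'your_database' and 'your_table' are placeholders.
SELECT column_name, character_set_name, collation_name
FROM information_schema.columns
WHERE table_schema = 'your_database'
  AND table_name = 'your_table';
```

Non-text columns show NULL in both fields.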
Dimag Kharab

Try using utf8_unicode_ci rather than utf8_general_ci - it uses a more accurate comparison algorithm.

Edper
Paul J
  • I tried using utf8_unicode_ci and it still does not differentiate between e and é. Then I tried utf8mb4_unicode_ci and got the same result: Perez and Pérez are still considered the same. – FP Towers Mar 13 '14 at 16:43
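
The result in the comment is expected. Collations such as utf8_general_ci and utf8_unicode_ci compare e and é as equal, so grouping under them merges Perez and Pérez. One possible approach, sketched here on the assumption that the `name` column is stored as utf8mb4 and the table is called `your_table` (a placeholder), is to group under a binary collation:

```sql
-- Assumes `name` is a utf8mb4 column. For a utf8 column use utf8_bin instead,
-- because the collation must belong to the column's character set.
SELECT name COLLATE utf8mb4_bin AS exact_name,
       COUNT(*) AS occurrences
FROM your_table
GROUP BY exact_name;
```

With the sample data from the question this should give three groups: Smith (2 occurrences), Perez and Pérez. A binary collation is also case-sensitive, so "smith" and "Smith" would become separate groups. On MySQL 8.0, utf8mb4_0900_as_ci is an accent-sensitive, case-insensitive alternative.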