Based on http://www.i18nqa.com/debug/utf8-debug.html I want to perform a search in my MySQL table to see if I have rows that have encoding problems.

If I run the following query:

select t.col1 from table t where t.col1 like '%Ú%' 

it returns every row whose t.col1 contains the characters 'as'.

How can I change the query so that it fetches only the rows containing '%Ú%', and not every row that contains '%as%'?

Marius

2 Answers

1

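Try this if you are using the latin1_swedish_ci collation:

select t.col1 from table t where t.col1 regexp '^[Ú]';

Note that the leading ^ anchors the match to the start of the value, so this finds only rows that begin with the character. Drop the ^ to match it anywhere in the column.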
Hassan Farooq
  • Note that `REGEXP` is rather lame -- it does not understand character sets, only bytes. So Hassan's use of it happens to work. – Rick James May 09 '16 at 22:15
0

With MySQL's collations, case-folding and accent-stripping go together.

If you want neither, use the ..._bin collation for the character set you are using.

WHERE foo LIKE '%Ú%' COLLATE utf8_bin
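
As a rough illustration, the difference between the two kinds of collation can be checked directly, and the clause above can be dropped into the query from the question. This is only a sketch: it assumes the connection and the column both use the utf8 character set, and `my_table` is a hypothetical name standing in for the asker's table.

```sql
-- Compare the same pair of strings under a case/accent-insensitive
-- collation and under the binary collation (assumes a utf8 connection).
SELECT 'Ú' = 'u' COLLATE utf8_general_ci AS folded_match,
       'Ú' = 'u' COLLATE utf8_bin        AS exact_match;

-- The query from the question with the binary collation applied to the
-- comparison. my_table is a placeholder name, not from the original post.
SELECT t.col1
FROM my_table t
WHERE t.col1 LIKE '%Ú%' COLLATE utf8_bin;
```

If the connection uses a different character set (utf8mb4, for example), the collation name would need to be the matching ..._bin collation for that character set.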

Even faster would be to declare foo to be COLLATE utf8_bin instead of whatever you have. (Note: the default for utf8 is utf8_general_ci.)
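
A minimal sketch of that column change, assuming foo is a VARCHAR(255) in a hypothetical table `my_table`. The type and length here are assumptions. MODIFY restates the whole column definition, so any NOT NULL or DEFAULT attributes on the real column have to be repeated as well.

```sql
-- Redeclare foo with the binary collation so comparisons on it are
-- case- and accent-sensitive by default.
-- my_table and VARCHAR(255) are placeholders; use the real definition.
ALTER TABLE my_table
  MODIFY foo VARCHAR(255) CHARACTER SET utf8 COLLATE utf8_bin;

-- After the change, no COLLATE clause is needed in the query.
SELECT foo FROM my_table WHERE foo LIKE '%Ú%';
```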

Rick James