82

I have a MySQL table with utf8 general ci collation. In the table, I can see two entries:

abad
abád

I am using a query that looks like this:

SELECT *  FROM `words` WHERE `word` = 'abád'

The query result gives both words:

abad
abád

Is there a way to indicate that I only want MySQL to find the accented word? I want the query to only return

abád

I have also tried this query:

SELECT *  FROM `words` WHERE BINARY `word` = 'abád'

It gives me no results. Thank you for the help.

Chris
  • 746
  • 4
  • 10
  • 15

10 Answers10

106

If your searches on that field are always going to be accent-sensitive, then declare the collation of the field as utf8_bin (that'll compare for equality the utf8-encoded bytes) or use a language specific collation that distinguish between the accented and un-accented characters.

col_name varchar(10) collate utf8_bin

If searches are normally accent-insensitive, but you want to make an exception for this search, try;

WHERE col_name = 'abád' collate utf8_bin

Update for MySQL 8.0, plus addressing some of the Comments and other Answers:

  • The CHARACTER SET matches the beginning of the COLLATION.
  • Any COLLATION name ending in _bin will ignore both upper/lower case and accents. Examples: latin1_bin, utf8mb4_bin.
  • Any COLLATION name containing _as_ will ignore accents, but do case folding or not based on _ci vs _cs.
  • To see the collations available (on any version), do SHOW COLLATION;.
  • utf8mb4 is now the default charset. You should be using that instead of utf8.
  • It is better to have the CHARACTER SET and COLLATION set 'properly' on each column (or defaulted by the table definition) than to dynamically use any conversion routine such as CONVERT().
Rick James
  • 135,179
  • 13
  • 127
  • 222
17

In my version (MySql 5.0), there is not available any utf8 charset collate for case insensitive, accent sensitive searches. The only accent sensitive collate for utf8 is utf8_bin. However it is also case sensitive.

My work around has been to use something like this:

SELECT * FROM `words` WHERE LOWER(column) = LOWER('aBád') COLLATE utf8_bin
David
  • 2,942
  • 33
  • 16
  • 4
    `#1253 - COLLATION 'utf8_bin' is not valid for CHARACTER SET 'utf8mb4' ` – User Apr 10 '16 at 19:49
  • 5
    Then you should use utf8mb4_bin. – David Apr 11 '16 at 10:33
  • 1
    if you want to make it both-side-case-insensitive, don't forget to also `LOWER('the_searched_value')` so you will get `LOWER(column) = LOWER('aBád')` – jave.web Apr 05 '17 at 17:53
  • I also had to use utf8mb4_bin... this was for an "inline query" in phpMyAdmin. Wonder where and why these CHARACTER SETs are set...? – mike rodent May 15 '17 at 16:03
6

Accepted answer is good, but beware that you may have to use COLLATE utf8mb4_bin instead!

WHERE col_name = 'abád' collate utf8mb4_bin

Above fixes errors like:

MySQL said: Documentation 1253 - COLLATION 'utf8_bin' is not valid for CHARACTER SET 'utf8mb4'

Robert Sinclair
  • 4,550
  • 2
  • 44
  • 46
3

The MySQL bug, for future reference, is http://bugs.mysql.com/bug.php?id=19567.

colan
  • 2,818
  • 2
  • 20
  • 17
2

Check to see if the database table collation type end with "_ci", This stands for case insensitive...

Change it to collation the same or nearest name without the "_ci" ...

For example... change "utf8_general_ci" to "utf8_bin" Mke

HoldOffHunger
  • 18,769
  • 10
  • 104
  • 133
Mike
  • 99
  • 1
  • 3
1

I was getting the same error.

I've changed the collation of my table to utf8_bin (through phpMyAdmin) and the problem was solved.

Dharman
  • 30,962
  • 25
  • 85
  • 135
Silvio Delgado
  • 6,798
  • 3
  • 18
  • 22
0

Well, you just described what utf8_general_ci collation is all about (a, á, à, â, ä, å all equals to a in comparison).

There have also been changes in MySQL server 5.1 in regards to utf8_general_ci and utf8_unicode_ci so it's server version dependent too. Better check the docs.

So, If it's MySQL server 5.0 I'd go for utf8_unicode_ci instead of utf8_general_ci which is obviously wrong for your use-case.

0

That works for me for an accent insensitive and case insensitive search in MySql server 5.1 in a database in utf8_general_ci, where column is a LONGBLOB.

select * from words where '%word%' LIKE column collate utf8_unicode_ci

with

select * from words where'%word%' LIKE column collate utf8_general_ci

the result is case sensitive but not accent sensitive.

0
SELECT *  FROM `words` WHERE column = 'abád' collate latin1_General_CS 

(or your collation including cs)

cjk
  • 45,739
  • 9
  • 81
  • 112
0

You can try searching for the hex variable of the character, HEX() within mysql and use a similar function within your programming language and match these. This worked well for me when i was doing a listing where a person could select the first letter of a person.

Ólafur Waage
  • 68,817
  • 22
  • 142
  • 198