16

I have a MySQL table (test) with utf-8 charset encoding. There are three entries, two entries with normal characters and another name with accent characters.

CREATE TABLE test (
  id Integer,
  name VARCHAR(50), 
  PRIMARY KEY (id)
) ENGINE=InnoDB AUTO_INCREMENT=1 DEFAULT CHARSET=utf8;

INSERT INTO `test` (`id`, `name`) VALUES (1, 'aaaa');
INSERT INTO `test` (`id`, `name`) VALUES (2, 'AAAA');
INSERT INTO `test` (`id`, `name`) VALUES (3, 'áááá');

If I run the following select query, it returns all the 3 entries

Actual Result:-

select * from test where name like '%aa%';

id  | name
----|----
1   | aaaa
2   | AAAA
3   | áááá

Instead of that, it should be return last entry with id=3.

I don't want to use 'BINARY' OR 'COLLATE utf8_bin' because it returns only case sensitive search.

I need normal search with string like query, e.g:-

Expected Result:-

select * from test where name like '%aa%';

id | name
---|-----
1  | aaaa
2  | AAAA
SST
  • 2,054
  • 5
  • 35
  • 65
  • Take it simply. You can remove accents from column before search (write your function) e.g. `where removeaccent(name) LIKE '%aa%'` –  May 02 '16 at 10:39
  • Getting "ERROR 1305 (42000): FUNCTION DB.removeaccent does not exist" where do I write the function? – SST May 02 '16 at 11:10
  • Any inputs here really appreciated – SST May 03 '16 at 07:11
  • You should *write* your own function e.g. `CREATE FUNCTION CustomerLevel(name VARCHAR(100)) RETURNS VARCHAR(100)` http://www.mysqltutorial.org/mysql-stored-function/ –  May 03 '16 at 10:44
  • @SST - Problem solved, check my answer – Fathah Rehman P May 04 '16 at 12:19
  • It should be worth noting that MySQL *`UTF8_`* character set is ***NOT*** the full UTF8 set and so will not display all characters correctly. [Read about it here](http://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – Martin May 04 '16 at 12:21

5 Answers5

7

The utf8_bin collation is what you need for your requirement to handle accents

I don't want to use 'BINARY' OR 'COLLATE utf8_bin' because it returns only case sensitive search.

This is easier (and more performant) to solve with utf8_bin than solving the accent issue with another collation

SELECT * FROM test WHERE LOWER(name) like '%aa%' COLLATE utf8_bin

-> added after comments

The query above assumes that the query parameters are minuscule but if you cant modify the params to always be minuscules then you can also use this variation

SELECT * FROM test WHERE LOWER(name) like LOWER('%ÚÙ%') COLLATE utf8_bin
user3802077
  • 849
  • 9
  • 22
  • Thanks for your reply. Its works fine when I query a normal string. If I search '%ÚÙ%', it returns empty. why its getting failure?? (| 4 | ÛÚÙëý | | 5 | uuëý |) – SST May 05 '16 at 12:29
  • Sorry I did not specify, because you want case insensitivity we use LOWER to lowercase the name but the search term needs to be in lowercase too. – user3802077 May 05 '16 at 12:46
  • Is there any possibilities to search such strings with case insensitivity? – SST May 05 '16 at 12:50
  • 1
    SELECT * FROM test WHERE LOWER(name) like LOWER('%ÚÙ%') COLLATE utf8_bin – user3802077 May 05 '16 at 17:51
4

utf8_bin is the collation you want to distinguish accented characters.

In the query you can user lower to make the query case insensitive.

CREATE TABLE `token` (
  `id` int(11) NOT NULL DEFAULT '0',
  `name` varchar(50) CHARACTER SET utf8 COLLATE utf8_bin DEFAULT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=utf8

mysql> select * from token where lower(name) like '%aa%';
+----+------+
| id | name |
+----+------+
|  1 | aaaa |
|  2 | AAAA |
+----+------+
2 rows in set (0.00 sec)
Károly Nagy
  • 1,734
  • 10
  • 13
1

You can solve your problem using following query

  select * from token where (convert(name using ASCII)) like '%aa%'

convert is used to convert between character sets

Fathah Rehman P
  • 8,401
  • 4
  • 40
  • 42
  • In this query, If i have a mixed characters name like "ÛÚÙëý" and 'uuuëý', the above query returns empty set [mysql> select * from test where name = convert(name using ASCII) and name like '%uu%'; Empty set (0.00 sec)]. It should return the row which name is "uuuëý" – SST May 04 '16 at 12:25
  • @SST -answer modified. Check now – Fathah Rehman P May 04 '16 at 12:32
  • Thank you @Fathan!! When I execute this query [select * from test where (convert(name using ASCII)) like '%ÚÙ%';], It throws an error.. ERROR 1267 (HY000): Illegal mix of collations (ascii_general_ci,IMPLICIT) and (utf8_general_ci,COERCIBLE) for operation 'like' – SST May 05 '16 at 12:40
  • It should return any one of the entry for both queries [1. select * from test where (convert(name using ASCII)) like '%uu%'; Answer: | 5 | uuuëý | 2. select * from test where (convert(name using ASCII)) like '%ÚÙ%'; Expected: | 4 | ÛÚÙëý | – SST May 05 '16 at 12:45
1

Using RLIKE (REGEXP) could solve your problem (it will return your expected result by using a more powerfull version of like)

from MYSQL-Documentation:
A regular expression is a powerful way of specifying a pattern for a complex search.
.... REGEXP is not case sensitive, except when used with binary strings.

just replace

where name like '%aa%'

with

where Name rlike 'aa';

to do a case insensitive search for the expression 'aa'.

BUT :
This can be a somehow unsafe approach as unexpected results can be produced by comparing multi-byte characters according to MySQL Documentation.

iLikeMySql
  • 736
  • 3
  • 7
1

You can try with:

SELECT * FROM test.test
where convert(name using ascii) like '%aa%';

But be careful, convert has performance issues on indexes. More information at http://dev.mysql.com/doc/refman/5.7/en/mysql-indexes.html

corretge
  • 1,751
  • 11
  • 24