In searching words containing German umlauts(ä,ö,ü) I get search results only if I use 'o' in place of 'ö'

Question

I have a web application which uses mysql.I am using mysql 5.5 and my charset is UTF-8 and collation is utf8_general_ci.I have an issue where if I search for term like Jöhn in my web application I don't get any result,although there is an entry in database as Jöhn.The string is stored correctly in the database.I get the result if I search with 'o' instead of 'ö' (i.e)if I give search term as 'john' I get 'jöhn' in search result.I don't have any other entry like john.The only entry is using the german umlaut i.e jöhn.

Possible duplicate of [What's the difference between utf8\_general\_ci and utf8\_unicode\_ci](http://stackoverflow.com/questions/766809/whats-the-difference-between-utf8-general-ci-and-utf8-unicode-ci) — Holger Just, Sep 02 '16 at 09:46

score 1 · Answer 1 · answered Sep 02 '16 at 23:51

Let me walk through the possible situations...

The single character ö (hex C3B6) is usually treated as the same as o in most collations, including utf8_general_ci. You can see this in

SELECT 'o' = 'ö' COLLATE utf8_general_ci;  --> 1 (meaning True)

See utf8 collations . 8 other collations (bin, danish, icelandic, swedish, estonian, turkish, hungarian, and german2) treat them differently.

There is at least one way to write ö as two characters: UNHEX('CC886F') is a 'combining diaeressis' plus 'o'.

SELECT UNHEX('CC886F') = 'o' COLLATE utf8_unicode_ci; --> 1
SELECT UNHEX('CC886F') = 'o' COLLATE utf8_general_ci; --> 0

Back to the question... If the ö is really two characters, then you do need utf8_unicode_ci. If you can get the HEX, check against the hex I listed above.

Probably the only way for "if I search for term like Jöhn in my web application I don't get any result,although there is an entry in database as Jöhn" to happen is the two renderings of the character.

(So I agree with Robin, but I hope this illuminates the "why".)

Even after changin to utf8_unicode_ci,the issue is not resolved. — S Shruthi, Sep 06 '16 at 13:00

score 0 · Answer 2 · answered Sep 02 '16 at 10:09

0

Use collation utf8_unicode_ci. This should solve the problem.

answered Sep 02 '16 at 10:09

rbr94

2,227
3
23
39

In searching words containing German umlauts(ä,ö,ü) I get search results only if I use 'o' in place of 'ö'

2 Answers2