The data is displayed correctly, but searches don't find anything. This is how the data is stored in a table with UTF-8 encoding.

Default charset: utf8mb4
names: utf8
character_set_client : utf8

[Screenshot of the stored table data]

SELECT * FROM article WHERE description LIKE '%några%'; -- returns no rows, but it should return one row
SELECT * FROM article WHERE description LIKE '%nå%';    -- works
SELECT * FROM article WHERE description LIKE '%någ%';   -- returns no rows

I think MySQL converts å to a.

I have tried converting the search string to UTF-8 with the PHP function utf8_encode($str), without success. How can I solve this issue?

  • Have you tried other functions as well, e.g. instr(description, 'några') > 0? – Krishna Jan 22 '14 at 08:10
  • Tried it now. No success. However, it works if I search for %nÃ¥gra% –  Jan 22 '14 at 08:17
  • possible duplicate of [UTF-8 all the way through](http://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – deceze Jan 22 '14 at 08:18
  • "nÃ¥gra" stored in the database and "några" are obviously not the same words. You have an encoding problem when inserting into the database, storing your data as garbage. – deceze Jan 22 '14 at 08:18
  • So the screenshot I posted is not in utf8? –  Jan 22 '14 at 08:28
  • @Error404 It is UTF8 misinterpreted as something else, probably Latin1. – glglgl Jan 22 '14 at 08:29
  • "några" encoded in UTF-8 will display as "några" when interpreted as UTF-8. "några" encoded in Latin-1 will display as "några" when interpreted as Latin1. A string misinterpreted or misconverted at some point will look like garbage. – deceze Jan 22 '14 at 08:31
  • @deceze Copy your last comment and post it as an answer. I truncated all tables and set the connection to utf8. The new data is readable in the database, and the search now works as expected. Thanks for the help. –  Jan 22 '14 at 08:33

1 Answer

"nÃ¥gra" stored in the database and "några" are obviously not the same words. You have an encoding problem when inserting into the database, storing your data as garbage. – deceze 15 mins ago

"några" encoded in UTF-8 will display as "några" when interpreted as UTF-8. "några" encoded in Latin-1 will display as "några" when interpreted as Latin1. A string misinterpreted or misconverted at some point will look like garbage. – deceze 2 mins ago

Possible duplicate of [UTF-8 all the way through](http://stackoverflow.com/questions/279170/utf-8-all-the-way-through) – deceze 16 mins ago
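
To make the misinterpretation described above concrete, here is a minimal sketch of how "några" can end up as "nÃ¥gra". It is written in Python rather than PHP/MySQL only because it works directly on bytes. It assumes the connection treated the UTF-8 bytes as Latin-1, which the comments suggest as the probable cause but which is not confirmed. The variable names are purely illustrative.

```python
# Illustrative sketch, assuming the connection charset was Latin-1
# while the application was actually sending UTF-8 bytes.
original = "några"

# The application sends the string as UTF-8 bytes; "å" becomes two bytes.
utf8_bytes = original.encode("utf-8")

# The receiving side reads each byte as one Latin-1 character,
# so the two bytes of "å" turn into the two characters "Ã¥".
misread = utf8_bytes.decode("latin-1")
print(misread)

# A search for the real word cannot match the stored garbage.
print(original in misread)

# Undoing the same wrong step recovers the original text.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired)
```

This is why searching for %nÃ¥gra% matched while %några% did not: the stored value really was the six-character string "nÃ¥gra". Setting the connection to utf8 before inserting, as the asker did, avoids the misreading in the first place.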
