0

folks. I'm new at MySQL programming and I've tried all things to manage this.

I would like to do a insensitive search with/without accents, lowercases or uppercases and all return the same results.

SELECT * FROM table WHERE campo_pesquisado LIKE '%termo_pesquisado%' ORDER BY campo_pesquisado ASC

So, in my DB (MyISAM - collation utf8_general_ci) I have this

+---------+--------+
| campo_pesquisado |
+---------+--------+
|   São Paulo     |
|   SÃO JOÃO     |

I would like to type termo_pesquisado (keywords) = São Paulo, Sao Paulo, SÃO PAULO or any combination of 'São Paulo' to get the return of São Paulo (that in browser shows correctly - São Paulo) from the database.

THE PROBLEM

If I type "são paulo, SãO PAULO or any combination with the "ã" lowercase works. It's because the UTF-8 respective code for ã is ã. If I search for SÃO PAULO, the à letter become à the full word will be SÃO PAULO that is clearly not equal to São Paulo.

TRYING

To solve this I tried this code bellow, but is not working for me.

 SELECT *, CONVERT(CAST( campo_pesquisado AS BINARY) USING utf8) FROM table WHERE CONVERT(CAST( campo_pesquisado AS BINARY) USING utf8) LIKE '%termo_pesquisado%' ORDER BY campo_pesquisado ASC

IMPORTANT

I can't change my collation. We have to use utf8 as char encode for the tables. Its better for multilanguage purposes.

I'm using PHP (5.5) and the last version of MySQL.

  • Isn't the utf8 character for `ã`, `ã`? Note that `collation` and `character set` are different separate things. It sounds like you aren't in utf8 somewhere. – chris85 Nov 23 '15 at 05:15
  • I am not sure but can you try this : SELECT * FROM table WHERE BINARY campo_pesquisado = 'termo_pesquisado' ORDER BY campo_pesquisado ASC – krish Nov 23 '15 at 05:17
  • @chris85 I will try to follow the possible answer in **UTF-8 all the way through** – Renato Pirei Nov 23 '15 at 05:22
  • @chris85 - If you follow this [link](http://site112.com/converter-latin-utf) you will see that the character for ã is ã (8859-1 to UTF-8). – Renato Pirei Nov 23 '15 at 05:25
  • @krish I tried but not worked. :( – Renato Pirei Nov 23 '15 at 05:27
  • @chris85 The **ã** is ISO-8859-1, when we use the [utf8_encode php function](http://php.net/manual/pt_BR/function.utf8-encode.php) the **ã** turns in **ã** – Renato Pirei Nov 23 '15 at 05:34
  • Where are you in ISO, why not in UTF8? http://www.fileformat.info/info/unicode/char/e3/index.htm – chris85 Nov 23 '15 at 05:39
  • @chris85 our main language is Portuguese. The browser translate automatically the DB characters from UTF8 to ISO. When the user types something he types with an keyboard that has **ã** (keys with accents). The codes are converted when the user send the info (Eg.: forms, inputs ...) – Renato Pirei Nov 23 '15 at 05:45
  • Look through that utf8 thread, you should have utf8 for all your pages; from the user input to the database write. – chris85 Nov 23 '15 at 13:28
  • Search for Mojibake. – Rick James Dec 11 '15 at 02:26

0 Answers0