I'm developing a website in Brazilian Portuguese and I'm facing some really annoying encoding problems.

Words that should be written this way: óbito are being written this way: �bito

I have noticed that the text is fine while it is still in the database, but the encoding problem appears as soon as I echo it with PHP.

Things I have already tried that did not help:

1- Saved the PHP files as UTF-8

2- Added this meta tag: <meta http-equiv="content-type" content="text/html; charset=utf-8" />

3- Ran these SQL queries:

CREATE DATABASE db_name
    CHARACTER SET utf8
    DEFAULT CHARACTER SET utf8
    COLLATE utf8_general_ci
    DEFAULT COLLATE utf8_general_ci
    ;

ALTER DATABASE db_name
    CHARACTER SET utf8
    DEFAULT CHARACTER SET utf8
    COLLATE utf8_general_ci
    DEFAULT COLLATE utf8_general_ci
    ;

ALTER TABLE tbl_name
    DEFAULT CHARACTER SET utf8
    COLLATE utf8_general_ci
    ;
marc_s
rlc
  • Is the browser encoding automatically set to UTF-8 when you load the page? – quantumSoup Jul 17 '10 at 21:50
  • If not, does it display the text correctly when you change your browser's encoding to UTF-8? – quantumSoup Jul 17 '10 at 21:50
  • Yes. The browser automatically set to UTF-8 when I load the page. – rlc Jul 17 '10 at 21:58
  • @Aircule: The presence of `�` is a good sign that the page is already being interpreted as UTF-8, because that character is used to denote corrupt byte sequences. – Gumbo Jul 17 '10 at 21:59
  • @Rafael Carvalho: I guess the data in your database is already corrupted. You could try a hexadecimal dump like `SELECT HEX(columnname) FROM …` to see the hexadecimal representation. – Gumbo Jul 17 '10 at 22:00
  • What if you change your browser's encoding to ISO-8859-1 (Latin 1 aka Western)? – quantumSoup Jul 17 '10 at 22:00
  • @Gumbo the results are 5265636F6D70656E7361206D65726563696461 , 4156414C4F4E2048494748 and stuff like that. @Aircule When I change to ISO-8859-1 the `�` characters go away, but this problem comes up: http://stackoverflow.com/questions/3242762/i-enconding-issue – rlc Jul 17 '10 at 22:05
  • @Rafael Carvalho: For what value is that the dump? Please select a short and easy example value or just take a sub-string. – Gumbo Jul 17 '10 at 22:13
  • Those results don't tell us anything, because they come from strings that contain only regular ASCII characters (which ISO-8859-1 and UTF-8 encode identically). Try a dump of a string that has accented characters (e.g. óbito). – quantumSoup Jul 17 '10 at 22:15
  • @Gumbo "Recompensa merecida" and "AVALON HIGH" respectively. – quantumSoup Jul 17 '10 at 22:17
  • @Gumbo, @Aircule. Sorry for that. result: 417465737461646F20646520C3B36269746F <--> Atestado de óbito other example: 43726570C3BA7363756C6F <--> Crepúsculo – rlc Jul 17 '10 at 22:17
  • Well, that is valid UTF-8 (ie: it's not corrupt in the database). Maybe you are doing some string manipulation in PHP after retrieving it from the DB? – quantumSoup Jul 17 '10 at 22:28
  • @Rafael Carvalho: Well, the data seems to be correct in this case. Then I guess your session/connection settings are wrong. Try to execute these statements to see the current charset/collation settings: `SHOW VARIABLES LIKE 'character_set%'; SHOW VARIABLES LIKE 'collation%';` – Gumbo Jul 17 '10 at 22:33
  • @Gumbo It's probably that: collation_connection = utf8_unicode_ci, collation_database = utf8_general_ci, collation_server = latin1_swedish_ci. How do I solve this? – rlc Jul 17 '10 at 22:43
  • See "question mark" and "black diamond" in https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored – Rick James Jan 05 '19 at 23:15
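
The hex dump quoted in the comments can be checked outside MySQL. The following is a minimal sketch, assuming PHP 5.4 or later with the mbstring extension loaded; it decodes the dump for "Atestado de óbito" and confirms that the stored bytes are valid UTF-8, which is the conclusion reached in the comments.

```php
<?php
// Hex dump copied from the comment thread (output of SELECT HEX(...)).
$hex = '417465737461646F20646520C3B36269746F';

// hex2bin() turns the hex digits back into the raw bytes MySQL stored.
$bytes = hex2bin($hex);

// mb_check_encoding() returns true when the bytes form valid UTF-8.
$isValidUtf8 = mb_check_encoding($bytes, 'UTF-8');

// In UTF-8, "ó" (U+00F3) is the two-byte sequence C3 B3.
$hasTwoByteO = strpos($bytes, "\xC3\xB3") !== false;

var_dump($isValidUtf8, $hasTwoByteO);
```

Both checks passing means the data at rest is fine, so the corruption has to happen somewhere between the database and the browser.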

6 Answers

Don't try to reinvent the wheel; keep things simple. Just use the following line after selecting the database:

mysql_query("SET NAMES 'utf8'") OR die(mysql_error()); 
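
The mysql_* functions were removed in PHP 7.0, so on current PHP versions the same idea has to be expressed through another extension. Below is a minimal sketch using PDO, where the charset is requested in the DSN instead of through a separate SET NAMES query. The helper names `buildMysqlDsn` and `openUtf8Connection` are hypothetical, the credentials are placeholders, and `utf8mb4` is an assumption (it needs MySQL 5.5.3 or later; pass `'utf8'` to match the question's setup exactly).

```php
<?php
// Hypothetical helper: builds a PDO DSN that requests a UTF-8 connection.
function buildMysqlDsn(string $host, string $dbName, string $charset = 'utf8mb4'): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=%s', $host, $dbName, $charset);
}

// Hypothetical helper: opens the connection and makes PDO throw on errors.
// Requires the pdo_mysql extension and a reachable server when called.
function openUtf8Connection(string $host, string $dbName, string $user, string $pass): PDO
{
    return new PDO(buildMysqlDsn($host, $dbName), $user, $pass, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION,
    ]);
}

// Show the DSN that would be used; no database connection is made here.
echo buildMysqlDsn('localhost', 'db_name'), PHP_EOL;
```

Calling `openUtf8Connection('localhost', 'db_name', 'user', 'secret')` would then return a connection whose results arrive as UTF-8.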
Shahid Karimi

You can change the charset using this function:

$q = mysql_set_charset('utf8');
var_dump($q);

If the function returns true, the call succeeded. This should fix the problem for that connection.

For older versions of PHP, you can use the following:

<?php
 if (function_exists('mysql_set_charset') === false) {
     /**
      * Sets the client character set.
      *
      * Note: This function requires MySQL 5.0.7 or later.
      *
      * @see http://www.php.net/mysql-set-charset
      * @param string $charset A valid character set name
      * @param resource $link_identifier The MySQL connection
      * @return TRUE on success or FALSE on failure
      */
     function mysql_set_charset($charset, $link_identifier = null)
     {
         if ($link_identifier == null) {
             return mysql_query('SET NAMES "'.$charset.'"');
         } else {
             return mysql_query('SET NAMES "'.$charset.'"', $link_identifier);
         }
     }
 }
 ?>

It seems that PHP's MySQL connection uses latin1 by default, and I can't find a way to change that default. So I guess you'll have to call mysql_set_charset() every time you open a new connection.
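
To see why a latin1 connection produces `�`, here is a minimal sketch that needs no database. It assumes the mbstring extension is loaded and simulates the conversion a latin1 connection would apply to the UTF-8 text before PHP receives it.

```php
<?php
// "óbito" as UTF-8 bytes: "ó" is the two-byte sequence C3 B3.
$utf8 = "\xC3\xB3bito";

// Simulate a latin1 connection: the text arrives converted to
// ISO-8859-1, where "ó" is the single byte F3.
$latin1 = mb_convert_encoding($utf8, 'ISO-8859-1', 'UTF-8');

// F3 followed by ASCII "b" is not a valid UTF-8 sequence, so a browser
// reading the page as UTF-8 shows the replacement character instead.
$utf8IsValid   = mb_check_encoding($utf8, 'UTF-8');
$latin1IsValid = mb_check_encoding($latin1, 'UTF-8');

printf("utf8:   %s valid=%s\n", bin2hex($utf8), var_export($utf8IsValid, true));
printf("latin1: %s valid=%s\n", bin2hex($latin1), var_export($latin1IsValid, true));
```

This also matches the observation in the question that switching the browser to ISO-8859-1 makes the accented characters appear correctly.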

Good luck.

quantumSoup

I solved mine by putting:

<meta http-equiv="content-type" content="text/html; charset=utf-8" />

in the <head> of the page, in addition to recreating the tables and changing the database collation to a UTF-8 one.

the Tin Man

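Use this for Portuguese. Note that pt-BR is a language tag rather than a character set, so it belongs in the lang attribute (for example <html lang="pt-BR">), while the charset stays utf-8:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />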
Harshal
Put the following meta tag inside the <head> tag of your HTML/PHP document:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />

If you are using a MySQL database with PHP, you also need to run this query in your main functions file: mysql_query("SET NAMES 'utf8'");

Ali Nawaz

I solved it by setting the charset right after connecting to the database:

$con = mysqli_connect('localhost', 'root', 'password', 'testdb');
if (mysqli_connect_errno()) {
    exit("Failed to connect to MySQL: " . mysqli_connect_error());
}
mysqli_set_charset($con, 'utf8');
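
The call above ignores the return value of mysqli_set_charset(), which is false when the charset could not be applied. A minimal sketch of a stricter variant follows. The function name `connectUtf8` is hypothetical, the credentials are the placeholders from this answer, and the `utf8mb4` default is an assumption (pass `'utf8'` to reproduce the original behavior).

```php
<?php
// Hypothetical wrapper: connect, then fail loudly if the charset
// cannot be applied. Requires the mysqli extension when called.
function connectUtf8(
    string $host,
    string $user,
    string $pass,
    string $db,
    string $charset = 'utf8mb4'
): mysqli {
    // Since PHP 8.1 mysqli throws on failure by default; the checks
    // below cover configurations where it returns false instead.
    $con = mysqli_connect($host, $user, $pass, $db);
    if (!$con) {
        throw new RuntimeException('Failed to connect to MySQL: ' . mysqli_connect_error());
    }

    // mysqli_set_charset() returns false when the charset is rejected.
    if (!mysqli_set_charset($con, $charset)) {
        throw new RuntimeException('Failed to set charset: ' . mysqli_error($con));
    }

    return $con;
}
```

After `$con = connectUtf8('localhost', 'root', 'password', 'testdb');`, calling `mysqli_character_set_name($con)` reports the charset actually in use on the connection.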