There are a few things that can cause your encoding to break like this. Unicode in itself isn't a character encoding, though; I'm guessing you mean UTF-8 (since that's what you tried with your HTML header).
So, it's important that your entire pipeline of code uses the same charset. If any part of it differs, you may get unexpected results, like you are now. In general, you need to think about:
- HTML header
- PHP header
- MySQL connection
- Databases/tables
- Your actual .php file
You already set your HTML header, with
<meta http-equiv="content-type" content="text/html; charset=utf-8" />
But that doesn't account for the data PHP sends, so you also need
<?php
header('Content-Type: text/html; charset=utf-8');
And this line should be placed before any kind of output (put it at the top of your file).
Next, we need to check the connection. You're using the old, outdated mysql_ API, so you'd need
mysql_set_charset("utf8");
after creating your MySQL connection (typically right after mysql_select_db()).
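As for your databases and tables, you need to know that collation is not the same as charset: the charset determines how characters are stored, while the collation determines how they are compared and sorted. Both should be UTF-8 based. Note that MySQL's utf8 stores at most three bytes per character, so if you need characters outside that range (emoji, for example), use utf8mb4 and utf8mb4_unicode_ci instead. You can set both by running these queries in SQL: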
ALTER DATABASE databasename CHARACTER SET utf8 COLLATE utf8_unicode_ci;
ALTER TABLE tablename CONVERT TO CHARACTER SET utf8 COLLATE utf8_unicode_ci;
Lastly, the file itself might also need to be UTF-8 encoded. If you're using Notepad++ to write your code, this can be done in the "Format" drop-down on the menu bar (named "Encoding" in newer versions). You should use UTF-8 w/o BOM (see What's different between UTF-8 and UTF-8 without BOM?).
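The reason to avoid the BOM is that its three bytes count as output, so they are sent before your header() call gets a chance to run. If you're unsure whether a file was saved with one, a minimal sketch like the following can check it. The function name hasUtf8Bom is made up for this illustration.

```php
<?php
// Hypothetical helper: reports whether a file starts with the UTF-8 BOM.
function hasUtf8Bom(string $path): bool
{
    $handle = fopen($path, 'rb');
    if ($handle === false) {
        // File could not be opened; treat it as "no BOM found".
        return false;
    }
    // The UTF-8 BOM is exactly the three bytes EF BB BF.
    $firstBytes = fread($handle, 3);
    fclose($handle);

    return $firstBytes === "\xEF\xBB\xBF";
}
```

Calling hasUtf8Bom(__FILE__) from the script in question should return false once the file is saved as UTF-8 w/o BOM.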
Note:
Any data already stored in the database will not necessarily have this encoding. So if there are broken characters in your database and you go through all these steps, your content might need to be re-uploaded to the database with the proper encoding (just submit the content again). There are methods to do this in batches if you have a lot of data. I've heard of people successfully using Force UTF8 for this, although I've never used it myself.
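If you'd rather check and repair values yourself, something along these lines might work as a starting point. It is only a sketch: it assumes the mbstring extension is loaded and that the broken values were originally stored as ISO-8859-1, which may not be true for your data. The function name toUtf8 is made up for this illustration, and it does not repair text that was double-encoded.

```php
<?php
// Hypothetical helper: returns $value as UTF-8.
// Assumption: anything that is not valid UTF-8 is ISO-8859-1 (adjust if needed).
function toUtf8(string $value, string $assumedSource = 'ISO-8859-1'): string
{
    // Already valid UTF-8: leave it untouched to avoid double-encoding.
    if (mb_check_encoding($value, 'UTF-8')) {
        return $value;
    }

    // Otherwise convert from the assumed source encoding.
    return mb_convert_encoding($value, 'UTF-8', $assumedSource);
}
```

You would run each affected column value through this and write the result back, ideally after taking a backup of the table.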
Footnotes:
Your code is using a very outdated API, and I highly recommend you look into either MySQLi or PDO. Should you do so, make sure to use prepared statements on anything that deals with variables.
mysql_* functions have been deprecated since PHP 5.5 (and were removed entirely in PHP 7), and you should stop using them if you can. Choose another API that allows you to use prepared statements (which you really should when dealing with variables), like mysqli_* or PDO; see choosing an API.
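To give an idea of what that looks like with MySQLi, here is a rough sketch of a connection that sets the charset, plus one prepared statement. The function names, the tablename table and its id and title columns are placeholders invented for this example, not something taken from your code.

```php
<?php
// Sketch only: function names, table and column names are hypothetical.
function connectUtf8(string $host, string $user, string $password, string $database): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $db = new mysqli($host, $user, $password, $database);

    // MySQLi counterpart of mysql_set_charset("utf8").
    $db->set_charset('utf8');

    return $db;
}

function findTitleById(mysqli $db, int $id): ?string
{
    // The ? placeholder keeps the variable out of the SQL string itself.
    $stmt = $db->prepare('SELECT title FROM tablename WHERE id = ?');
    $stmt->bind_param('i', $id);
    $stmt->execute();
    $stmt->bind_result($title);

    // fetch() returns true when a row was read, null when there is none.
    $found = $stmt->fetch();
    $stmt->close();

    return $found ? $title : null;
}
```

You would call connectUtf8() once with your own host, user, password and database name, then pass the returned connection to findTitleById().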
You could also have a look at