
Previous issue: I was not able to store non-English characters:

How to store non-english characters?

That was fixed by using UTF-8. But I realized today that symbols like ♥☆ are not stored correctly. They get converted to characters like â™¥â˜†.

How can this be fixed?

Yeti

2 Answers


It looks to me like they're being stored correctly, but that you're not interpreting them correctly when you read them out. ♥ and ☆ are going to end up as multibyte characters in UTF-8 encoding. I'll bet if you look up that multibyte encoding, you'll see it's the same as the single-byte encoding for â™¥ and â˜† respectively.

Edit: adding details.

As you can see in the following table, interpreting the bytes of the UTF-8 characters as if they were encoded as Windows Latin-1 (Windows-1252) gives the results you're seeing.

UTF-8 character      Hex
♥                    e2 99 a5
☆                    e2 98 86

Windows Latin-1      Hex
â                    e2
™                    99
¥                    a5
˜                    98
†                    86
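
The misreading in the table can be reproduced with a small sketch. It assumes PHP with the mbstring extension available; the variable names are only illustrative. The symbols are written as byte escapes so the result does not depend on how the script file itself is saved.

```php
<?php
// UTF-8 bytes for ♥ (e2 99 a5) followed by ☆ (e2 98 86).
$utf8 = "\xE2\x99\xA5\xE2\x98\x86";

// Show the raw bytes that are actually stored.
echo bin2hex($utf8), "\n"; // e299a5e29886

// Pretend those bytes are Windows-1252 and convert them to UTF-8.
// This is the same mistake a reader makes when it assumes the wrong charset.
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'Windows-1252');
echo $garbled, "\n"; // â™¥â˜†

// Undoing the conversion gives back the original bytes, which suggests
// the stored data is intact and only the interpretation is wrong.
$restored = mb_convert_encoding($garbled, 'Windows-1252', 'UTF-8');
var_dump($restored === $utf8);
```

If the garbled text only appears on output, the data itself is probably fine and the fix belongs wherever the charset is declared to the reader.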
Carl Norum
  • It's not to do with PHP. In your HTML, you have to give a charset of UTF-8. – Amy B Jun 10 '10 at 16:44
  • An old, very broken browser is also possible. It'd help if we had an actual HTML page generated by this application to look at. – Nicholas Knight Jun 10 '10 at 16:50
  • @Nicholas: Even IE6 can do UTF-8. If you're using a browser which can't do UTF-8, it barely qualifies as a browser these days. – Michael Madsen Jun 10 '10 at 16:52
  • @Michael: I'm glad you've not had the recent misfortune of dealing with browsers in real-world settings even older and more broken than IE6. Not all of us have been so lucky. – Nicholas Knight Jun 10 '10 at 17:31
  • @Nicholas: I'm guessing it must be corporate policies that caused this misfortune. – Yeti Jun 10 '10 at 17:42
  • @Nicholas... but those people expect the web to be broken. They're used to it. Just like people with 800x600 browsers expect to need to scroll. Of course, if you're getting paid by those people it's a different matter. (☆_☆) – Armstrongest Jun 10 '10 at 17:43

Is UTF-8 used consistently across the whole stack (MySQL, PHP, Apache, <meta> tags, HTTP headers)?

For me this worked out of the box:

$query = "update tbl set col = '♥☆' where id = 1";
mysql_query($query) or die(mysql_error());
$query = "select col from tbl where id = 1";
$res = mysql_query($query) or die(mysql_error());
print_r(mysql_fetch_row($res));

Debug output:

Content-type: text/html; charset=utf-8
Array
(
    [0] => ♥☆
)
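
If it does not work out of the box, one option is to state the charset explicitly at each layer instead of relying on defaults. The following is only a sketch under assumptions: it uses the mysqli extension rather than the `mysql_*` functions above, the helper names `open_utf8_connection` and `render_page` are hypothetical, and the connection parameters are placeholders supplied by the caller.

```php
<?php
// Hypothetical helper: open a connection and declare its character set.
function open_utf8_connection(string $host, string $user, string $pass, string $dbname): mysqli
{
    $db = new mysqli($host, $user, $pass, $dbname);
    // Tell MySQL which encoding this client sends and expects to receive.
    // 'utf8mb4' is an assumption; it needs a server version that supports it.
    $db->set_charset('utf8mb4');
    return $db;
}

// Hypothetical helper: wrap a value in a page that declares UTF-8 in the markup.
function render_page(string $text): string
{
    // Escape for HTML, treating the input as UTF-8.
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    return "<!DOCTYPE html>\n<html><head><meta charset=\"utf-8\"></head>"
         . "<body>{$safe}</body></html>";
}
```

In a request handler, this would typically be paired with `header('Content-Type: text/html; charset=utf-8');` before any output, then `echo render_page($row[0]);` with a row fetched from `tbl`.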
Lauri Lehtinen