
I copied and pasted an emoticon from Microsoft Word (a typed :) that Word turned into 😊) and inserted it into a MySQL table. The table and field use charset utf8mb4 with collation utf8mb4_unicode_ci, and the field type is longblob. The emoticon was stored in the table as 😊, but when I try to display it in my HTML page it turns into these weird characters: ðŸ˜Š. I tried htmlentities, htmlspecialchars, and htmlspecialchars_decode, but none of them displays the emoticon properly.

dapidmini
  • LONGBLOB columns don't have a character set or collation property. Do you mean LONGTEXT? – Bill Karwin Aug 12 '22 at 16:39
  • Here's some important reading on enabling utf8 content in a web presentation: https://stackoverflow.com/questions/279170/utf-8-all-the-way-through – Bill Karwin Aug 12 '22 at 16:44
  • Originally the field was longtext, but I read on SO that the field needs to be a blob type to store emoticons, and since the contents can be quite long, I changed it to longblob. Before that I tried the other suggestion, which was to change the charset and collation, so I mentioned it here just in case. – dapidmini Aug 12 '22 at 16:44
  • That advice was incorrect. You can store utf8 in a `CHAR(1)`. Using BINARY or BLOB or its sibling types is not going to work, because they store binary bytes, with no associated character set. – Bill Karwin Aug 12 '22 at 16:46
  • Well, I did try applying the charset and collation before, but the emoticon became `????` in the MySQL table. The emoticon was only inserted properly after I changed the field type to longblob. – dapidmini Aug 12 '22 at 16:49
  • `ðŸ˜Š` is how [Windows-1252](https://en.wikipedia.org/wiki/Windows-1252) interprets the byte sequence F0 9F 98 8A, which is the UTF-8 encoding of 😊 (U+1F60A). – dan04 Aug 12 '22 at 22:29
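
The byte-level explanation in the last comment can be reproduced with a few lines of PHP. This is only a minimal sketch, assuming the mbstring extension is available; the variable names are made up for illustration. It takes the four UTF-8 bytes F0 9F 98 8A and converts them as if they were Windows-1252 text, which is roughly what happens when a page or connection is labelled with the wrong charset.

```php
<?php
// Raw UTF-8 bytes of U+1F60A, the byte sequence quoted in the comment above.
$utf8Bytes = "\xF0\x9F\x98\x8A";

// Treat those bytes as Windows-1252 text and convert them to UTF-8.
// Each byte becomes one separate character, which produces the mojibake.
$misread = mb_convert_encoding($utf8Bytes, 'UTF-8', 'Windows-1252');

echo bin2hex($utf8Bytes), PHP_EOL;          // hex dump of the original bytes
echo $misread, PHP_EOL;                     // the misread string
echo mb_strlen($utf8Bytes, 'UTF-8'), PHP_EOL; // character count when read as UTF-8
echo mb_strlen($misread, 'UTF-8'), PHP_EOL;   // character count after the misread
```

If the stored bytes are intact, as they appear to be here, the fix belongs on the reading side (page charset, parser, or connection), not in re-encoding the data.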

2 Answers

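It turns out `htmlspecialchars_decode` was not what broke the output: I was also passing the string through `loadHTML`, which does not treat the input as UTF-8 unless it is told to. So I just needed to change the code to this: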

    $str = htmlspecialchars_decode($row['body']);

    $doc = new DOMDocument();
    // @$doc->loadHTML($str); // no encoding hint: emoticons are not shown properly
    // The XML encoding declaration in front of the markup makes loadHTML read it as UTF-8.
    @$doc->loadHTML('<?xml encoding="utf-8" ?>' . $str); // emoticons are shown properly
    // ... some other processes
    return $doc->saveHTML();
dapidmini

The data should be inserted into a TEXT column with CHARACTER SET utf8mb4, not a BLOB.

The connection parameters should include utf8mb4 and/or UTF-8, but not utf8. (This varies with the product reading the data.)

Do not use any encoders/decoders in PHP; that only adds to the confusion.
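
As a rough illustration of the first two points, here is a sketch that assumes PDO with the MySQL driver rather than whatever database layer the question actually uses. The helper names, the `posts` table, and the `body` column are hypothetical. The `charset=utf8mb4` part of the DSN is what sets the connection character set.

```php
<?php
// Hypothetical helper: builds a PDO MySQL DSN that requests utf8mb4 for the connection.
function buildUtf8mb4Dsn(string $host, string $dbName): string
{
    return sprintf('mysql:host=%s;dbname=%s;charset=utf8mb4', $host, $dbName);
}

// Hypothetical helper: opens the connection; the credentials are placeholders.
function connectUtf8mb4(string $host, string $dbName, string $user, string $password): PDO
{
    return new PDO(buildUtf8mb4Dsn($host, $dbName), $user, $password, [
        PDO::ATTR_ERRMODE => PDO::ERRMODE_EXCEPTION, // surface SQL errors as exceptions
    ]);
}

// One-off column change, assuming a table `posts` with a `body` column (made-up names).
const ALTER_BODY_SQL =
    'ALTER TABLE posts MODIFY body LONGTEXT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci';

// Example usage (needs a reachable MySQL server, so it is left commented out):
// $pdo = connectUtf8mb4('localhost', 'app', 'user', 'secret');
// $pdo->exec(ALTER_BODY_SQL);
```

On the output side, the page itself also has to declare UTF-8, for example with `header('Content-Type: text/html; charset=utf-8')` or a `<meta charset="utf-8">` tag.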

Rick James
  • I changed the database, field, and connection to utf8mb4, but when I try to insert the emoticon it becomes `????` in the table. Did I do something wrong? For the connection, I changed it in the `system/database/DB_driver.php` file. – dapidmini Aug 13 '22 at 08:30
  • @dapidmini - See "question marks" in https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored – Rick James Aug 13 '22 at 15:51