0

I've implemented an emoji picker for comments on my Spring & Thymeleaf web app/blog.

Currently, I can select an emoji, see it appear in the textarea, submit the form, the comment is saved inside the controller post method into my MySQL 5.7.17 db table - I can see the emoji art in the table column- the comment returns via ajax, and I can see the emoji on the page. Yay, woohoo!

But! After I reload the page... I see this:

"ð± and ð¶"

What gives??

In order to insert the emoji's in mysql, I followed this tutorial:

https://mathiasbynens.be/notes/mysql-utf8mb4

Storing is NOT the problem.

My my.cnf file, located at

/usr/local/Cellar/mysql/5.7.17/support-files/my.cnf

my.cnf:

--defaults-extra-file=#
[client]
default-character-set = utf8mb4

[mysqld]
init-connect='SET NAMES utf8mb4'
character-set-client-handshake = FALSE
character-set-server = utf8mb4
collation-server = utf8mb4_unicode_ci

[mysql]
default-character-set = utf8mb4

and then made this query:

ALTER TABLE comments CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

and this:

mysql> SET NAMES 'utf8mb4';
Query OK, 0 rows affected (0.00 sec) [then I put: init-connect='SET NAMES utf8mb4' in the cnf file]

mysql> SHOW VARIABLES WHERE Variable_name LIKE 'character\_set\_%' OR 
Variable_name LIKE 'collation%';
+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8mb4            |
| character_set_connection | utf8mb4            |
| character_set_database   | utf8               |
| character_set_filesystem | binary             |
| character_set_results    | utf8mb4            |
| character_set_server     | utf8               |
| character_set_system     | utf8               |
| collation_connection     | utf8mb4_general_ci |
| collation_database       | utf8_general_ci    |
| collation_server         | utf8_general_ci    |
+--------------------------+--------------------+
10 rows in set (0.00 sec)

^However, from what I understand, this only works once^

because when I run that command after I Run the app, it reads:

+--------------------------+--------------------+
| Variable_name            | Value              |
+--------------------------+--------------------+
| character_set_client     | utf8               |
| character_set_connection | utf8               |
| character_set_database   | utf8mb4            |
| character_set_filesystem | binary             |
| character_set_results    | utf8               |
| character_set_server     | utf8               |
| character_set_system     | utf8               |
| collation_connection     | utf8_general_ci    |
| collation_database       | utf8mb4_unicode_ci |
| collation_server         | utf8_general_ci    |
+--------------------------+--------------------+
10 rows in set (0.03 sec)

My pom.xml has this:

<properties>
    <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
    <project.reporting.outputEncoding>UTF 8</project.reporting.outputEncoding>
    <java.version>1.8</java.version>

    <property name="hibernate.connection.CharSet" value="utf8mb4" />
    <property name="hibernate.connection.characterEncoding" 
    value="utf8mb4"/>
    <property name="hibernate.connection.useUnicode" value="true"/>
</properties>

and on all relevant HTML pages and on the header fragment I have:

<meta charset="UTF-8">

When I System.out.println(comment.getBody()) in the controller's PostMapping method -both before and after I save the comment- I can see the emojis in the terminal just fine! But when I System.out.println(comment.getBody()) in the GetMapping for the page, I see all the weird characters and not the emoji. I'm really confused. What do you think the issue can be and what should I do to resolve it? Any help is appreciated, thank you in advance!

(From Comment:)

CREATE TABLE `comments` (
    `id` bigint(20) NOT NULL AUTO_INCREMENT, 
    `body` blob NOT NULL, 
    `created_date` datetime DEFAULT NULL, 
    `parent_id` bigint(20) DEFAULT NULL, 
    `post_id` bigint(20) DEFAULT NULL, 
    `user_id` bigint(20) DEFAULT NULL,
) ENGINE=InnoDB AUTO_INCREMENT=2084 DEFAULT CHARSET=utf8 
Rick James
  • 135,179
  • 13
  • 127
  • 222
RyanH
  • 25
  • 7

1 Answers1

0

That looks like 'Mojibake'; see Trouble with UTF-8 characters; what I see is not what I stored

But, since ð is hex F0, and F0 is the start of Emoji (etc), it may be that you have specified utf8 in MySQL instead of utf8mb4. What was "ð± and ð¶" supposed to be??

Spring/Hibernate:

Hibernate XML:

<property name="hibernate.connection.CharSet">utf8mb4</property>
<property name="hibernate.connection.characterEncoding">utf8</property>
<property name="hibernate.connection.useUnicode">true</property>

Connection url: db.url=jdbc:mysql://localhost:3306/db_nameuseUnicode=true&character_set_server=utf8mb4

CREATE TABLE

DEFAULT CHARSET=utf8 says that all VARCHAR and TEXT columns will be CHARSET utf8 unless overridden.

body blob NOT NULL, -- You aren't even using a text-like datatype! BLOB says "just throw the bytes in; don't even think about CHARSET".

Because of BLOB, if the Emoji is going into the body, the bytes should be coming out identical to the way they went in. But, let's check something else. Please get HEX(body), preferably for a very short body, perhaps with nothing but an Emoji in it.

For example, the hex for --

F09F9881 -- correctly in utf8mb4 (aka "UTF-8" outside MySQL).  Note leading F0
C3B0C5B8CB9CC281 -- "Double encoded".  Might display as 😠 Note leading ETH (ð)
Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Hey Rick, thanks for commenting! The emojis were supposed to be a cat and dog but I tested again with the classic smiley. ```mysql> SELECT body, HEX(body) FROM comments where id = 2077; | body | HEX(body) | +------+-----------+ | | F09F9880 | 1 row in set (0.00 sec)``` Is what I'm getting in Terminal. I placed what you wrote in my XML file and connection url in my application.properties but I still can't see it on the page, ugh. I'm one of those plagued. Should I change the my.cnf file at all? Thanks for your help sorry I'm new to this. – RyanH Nov 15 '18 at 19:14
  • @RyanH - Something, somewhere, is changing things to "utf8" instead of "utf8mb4". The "connection url" is the likely place. Or there could be a `SET NAMES utf8`. Keep digging around. – Rick James Nov 15 '18 at 19:40
  • Ok will do, thanks. I noticed this: ```mysql> SELECT character_set_name FROM information_schema.`COLUMNS` WHERE table_name = "comments" AND column_name = "body"; +--------------------+ | character_set_name | | NULL | 1 row in set (0.00 sec)``` could this be an issue? – RyanH Nov 15 '18 at 20:45
  • @RyanH - Please provide `SHOW CREATE TABLE comments` – Rick James Nov 15 '18 at 20:47
  • Ok: ```| comments | CREATE TABLE `comments` ( `id` bigint(20) NOT NULL AUTO_INCREMENT, `body` blob NOT NULL, `created_date` datetime DEFAULT NULL, `parent_id` bigint(20) DEFAULT NULL, `post_id` bigint(20) DEFAULT NULL, `user_id` bigint(20) DEFAULT NULL, ) ENGINE=InnoDB AUTO_INCREMENT=2084 DEFAULT CHARSET=utf8 |``` Hm, I see that the default character set is utf-8... – RyanH Nov 15 '18 at 21:03
  • So I altered the database and table: ```ALTER DATABASE spring_blog_db CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;``` ```ALTER TABLE comments CONVERT TO CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;``` Then ran the show create query again and now DEFAULT CHARSET=utf8mb4 COLLATE=utf8mb4_unicode_ci But, just like earlier, the emoji shows up on the ajax response, then goes back to "mojibake" when I reload the page :/ – RyanH Nov 15 '18 at 21:44
  • Those had no effect since there are no character columns in `comments` table. – Rick James Nov 15 '18 at 21:59