0

My Question

How can I store or convert emojis and french accents and other special non-ascii characters in a MySQL database?



The Problem

Within the wp_posts table of my MySQL database called local, some characters in the content of the post_content column are emojis or french accent characters like ë.

When I view the field content in Adminer, the emojis are not rendered properly but the special french accent characters are:

Content with emojis Content with accent chars
enter image description here v6.1.5+5536



What I've Tried

UPDATE wp_posts SET post_content = CONVERT(CAST(CONVERT(post_content USING latin1) AS BINARY) USING utf8)

^This SQL command converts most special characters but it totally nullifies (deletes all content) for the rows that have emojis or french characters.

UPDATE wp_posts SET post_content = CAST(CAST(CONVERT(post_content USING latin1) AS binary) AS CHAR)

^This SQL command successfully converts all emojis when I only run it on rows that include emojis:

Before SQL Command After SQL Command
enter image description here v6.1.5+5536

however, when I try to run this on the whole table, I get the following error:

Error in query (1366): Incorrect string value: '\xEBlle B...' for column 'post_content'

The command trips up once it gets to a french accent character. The \xEBlle is the ë from Noëlle.



My Setup

Local v6.1.5+5536
Adminer v4.8.1
MySQL v8.0.16
PHP v7.3.5
Database (local) Collation utf8mb4_general_ci
Table (wp_posts) Collation utf8mb4_general_ci
Column (post_content) Collation utf8mb4_general_ci
Column (post_content) Type longtext
  • it would be best that you create a fiddle wiith the charcterset that you have and the one you want with data in it so that we have a [mre] – nbk Nov 17 '21 at 20:53
  • Specify the `utf8mb4` character set on all tables and text columns in your database, see [UTF-8 all the way through](https://stackoverflow.com/questions/279170/) – JosefZ Nov 17 '21 at 20:59
  • @nbk SQL Fiddle and DB Fiddle do not allow me to set `utf8mb4_general_ci` as the table or column Collation even though that is what's set in Adminer. – TerminalVelocity Nov 18 '21 at 13:22
  • @josefZ, there is no option for plain `utf8mb4`, only options like `utf8mb4_general_ci`, `utf8mb4_bin`, `utf8mb4_unicode_ci` and more language based (cs, da, ru, ...). – TerminalVelocity Nov 18 '21 at 13:24
  • and you have utf8mb4 and what is the other charcter set or collation? – nbk Nov 18 '21 at 13:29
  • @Nbk, as mentioned in the top post, all Collations (DB, Table, and Column) are `utf8mb4_general_ci` and there is no option for just `utf8mb4`. – TerminalVelocity Nov 18 '21 at 13:36
  • See "Mojibake" in https://stackoverflow.com/questions/38363566/trouble-with-utf8-characters-what-i-see-is-not-what-i-stored – Rick James Nov 20 '21 at 01:21

0 Answers0