3

in a mysql insert request i used ($text being sent by $_POST and column MESSAGE being of JSON format) :

$data = '{"mytext":".'$text'."}';

INSERT INTO xxxxx (MESSAGE) VALUES('$mytext');

i was stuck with the following problem :

Some smiley are displaying ok, but some with a "?"

I searched here for an answer and found that utf8mb4 must be set on the database, tables and columns.

The problem is when i try the following which contains my registered message:

ALTER TABLE xxxxx CHANGE column_name column_name JSON CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;

Result of the above

You have an error in your SQL syntax;

I assume that it is only possible on Varchar() column.

Do you have any idea on how i can make this work ?

Thank you

Nayantara Jeyaraj
  • 2,624
  • 7
  • 34
  • 63
Bloob
  • 41
  • 1
  • 4
  • 1
    You're open to SQL injection!! This is a huge security risk! Don't use your code in a real life application!! – cramopy Apr 04 '18 at 09:16
  • 1
    What is the full error message, it must point to some specific part that it deems incorrect ("near ..."). – deceze Apr 04 '18 at 10:10
  • this is not th actual production code, i have a secure_input() function which real escape and strip. – Bloob Apr 04 '18 at 10:27
  • The exact error message is : ERROR 1064 (42000): You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near 'CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci' at line 1 > that's why i assume that converting a json column to utf8mb4 is not relevant – Bloob Apr 04 '18 at 10:28
  • It has nothing to do with the contents of the table, it means that `... column_name JSON CHARACTER SET ...` is an invalid query. – deceze Apr 04 '18 at 10:55
  • @deceze When i do column_name varchar(xx) it doesn't throw any error, so i assume that is the problem. – Bloob Apr 04 '18 at 13:53
  • What was the table definition _before_ converting to JSON? – Rick James Jan 29 '20 at 22:56

1 Answers1

2

Some smiley are displaying ok, but some with a "?"

Start by finding the problems that cause "question mark" from Trouble with UTF-8 characters; what I see is not what I stored

In particular:

  • The bytes to be stored are not encoded as utf8mb4. Fix this.
  • The column in the database is not CHARACTER SET utf8mb4. Fix this. (Use SHOW CREATE TABLE to see what it is currently.)
  • Also, check that the connection during reading is UTF-8.

To discuss further, please provide

  • the SHOW CREATE TABLE before attempting the ALTER. (I suspect the VARCHAR is currently CHARACTER SET utf8.)
  • the full error message. (I guess it says "near 'CHARACTER...'".)
  • what version of MySQL you are using.

The manual say "MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation.", I deduce that you do not need to specify either charset or collation for datatype JSON.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • this point makes sense `The manual say "MySQL handles strings used in JSON context using the utf8mb4 character set and utf8mb4_bin collation.", I deduce that you do not need to specify either charset or collation for datatype JSON` – danish_wani Nov 15 '21 at 12:30
  • @danish_wani - That is a simple question on the surface, but it is complex under the covers. I don't know the answer offhand; I'll do some research. – Rick James Nov 15 '21 at 21:41
  • @danish_wani - I am pretty sure that, at least in MySQL 8.0, you are forced into utf8mb4 and don't need to do anything. (I tried a table with a _different_ default charset; it was happy. However that may not be the only test needed.) – Rick James Nov 15 '21 at 21:50
  • Yes, my problem was storing currency symbols into the MySQL DB with default charset as latin1, but it threw error on most of the symbols then I thought I may have to change the charset but when I inserted the same currency symbols data into the json field without changing the character set it worked fine and stored the symbols as they were. – danish_wani Nov 16 '21 at 05:48
  • 1
    @danish_wani - It seems that a column with datatype `JSON` is encoded using utf8mb4 (no choice), while `VARCHAR` and `TEXT` allow setting a charset (such as latin1). – Rick James Nov 16 '21 at 17:06
  • Yes, that's true. Thank you. Appreciated – danish_wani Nov 16 '21 at 17:09
  • @danish_wani - Thank you. I learned something. – Rick James Nov 16 '21 at 17:26