
In a C++ program using libmysqlclient, I am trying to insert a row into a table containing html-encoded characters such as é. The request looks like this:

INSERT INTO mytable ( field )  VALUES( 'Justificatif d\'achat numéro 1523641305' )

It fails with error:

Incorrect string value: '\xE9ro 15...' for column 'field' at row 1

If I copy the request as-is into Workbench and execute it, it works. But I cannot get it to work from my code.

My column is in charset utf8 with default collation.

I connect to my DB using mysql_real_connect, and call "SET NAMES 'utf8'" just after having performed the connection.

First edit: I have read the questions and answers from Inserting UTF-8 encoded string into UTF-8 encoded mysql table fails with "Incorrect string value", and tried the suggested solutions, but they did not work for me. As far as I understand, my issue is different: the character I am trying to insert is U+00E9 in Unicode, which takes only 2 bytes in UTF-8. Also, I am able to insert it into my DB, just not from my C++ code.

Matthieu.V
  • **Also read:** https://medium.com/@adamhooper/in-mysql-never-use-utf8-use-utf8mb4-11761243e434 – Lightness Races in Orbit May 28 '19 at 10:18
  • Then you'll need to present your [MCVE]. – Lightness Races in Orbit May 28 '19 at 12:58
  • As @LightnessRacesinOrbit suggests, since the problem is with your C++ application you need to include a reproducible example of what you're doing in C++. However, psychic debugging: you said you perform `"SET NAMES 'utf8'"` **after** making your connection; but according to https://stackoverflow.com/questions/8112153/process-utf-8-data-from-mysql-in-c-and-give-result-back you need to make your calls to `mysql_options` after `mysql_init` but before `mysql_real_connect`. – Paul Wheeler May 28 '19 at 21:36
  • The dup (https://stackoverflow.com/questions/11936950/inserting-utf-8-encoded-string-into-utf-8-encoded-mysql-table-fails-with-incorr) involves 4-byte codes of utf8mb4; that is not the case here. So I am reopening. – Rick James May 30 '19 at 05:24

1 Answer

e-acute:

  • htmlentities: `&eacute;` -- Avoid this in databases.
  • latin1: hex E9
  • utf8: hex C3A9
  • Unicode "codepoint": U+00E9 -- Avoid this in databases.

When establishing the connection from your C++ client to MySQL, state what character encoding is being used in the client. Based on "Incorrect string value: '\xE9ro...", I assume it is latin1.
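To see why the server rejects the string when the connection is declared as utf8, here is a minimal sketch of a UTF-8 well-formedness check. The function name `first_invalid_utf8` is hypothetical, and the check is deliberately simplified (it looks only at the lead byte and the continuation bytes), so it is not necessarily the rule MySQL applies internally. The byte E9 announces a 3-byte sequence, but the bytes that follow it (`r`, `o`) are plain ASCII, so the sequence is malformed.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the index of the first byte that starts a
// malformed UTF-8 sequence, or std::string::npos if none is found.
// Simplified: it does not reject overlong 3/4-byte forms or surrogates.
std::size_t first_invalid_utf8(const std::string& s) {
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        if (lead < 0x80) {
            len = 1;                              // ASCII
        } else if (lead >= 0xC2 && lead <= 0xDF) {
            len = 2;                              // e.g. C3 A9 for e-acute
        } else if (lead >= 0xE0 && lead <= 0xEF) {
            len = 3;                              // E9 lands here
        } else if (lead >= 0xF0 && lead <= 0xF4) {
            len = 4;
        } else {
            return i;                             // stray continuation or invalid lead
        }
        if (i + len > s.size()) {
            return i;                             // sequence is truncated
        }
        for (std::size_t k = 1; k < len; ++k) {
            const unsigned char next = static_cast<unsigned char>(s[i + k]);
            if ((next & 0xC0) != 0x80) {
                return i;                         // expected a 10xxxxxx continuation byte
            }
        }
        i += len;
    }
    return std::string::npos;
}
```

Calling `first_invalid_utf8("num\xE9ro")` points at the E9 byte, which is where the error message starts quoting the value, while `"num\xC3\xA9ro"` passes the check.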

Separately, you can declare the column (field) in the table to be either CHARACTER SET latin1 or utf8 or utf8mb4. In the first case, the E9 will pass through unchanged. In the others, the E9 will be turned into C3A9.
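The conversion itself is simple enough to sketch. The hypothetical function `latin1_to_utf8` below shows how the single latin1 byte E9 becomes the two UTF-8 bytes C3 A9. It assumes strict ISO-8859-1 input; MySQL's latin1 treats the 0x80 to 0x9F range as cp1252, which this sketch does not handle. Normally the server does this for you once the connection charset is declared correctly; doing it client-side is only an alternative if the application must keep sending utf8.

```cpp
#include <string>

// Hypothetical helper: converts ISO-8859-1 bytes to UTF-8.
// Every latin1 byte maps to the Unicode code point of the same value,
// so bytes >= 0x80 always become a 2-byte UTF-8 sequence.
std::string latin1_to_utf8(const std::string& in) {
    std::string out;
    out.reserve(in.size() * 2);  // worst case: every byte doubles
    for (char ch : in) {
        const unsigned char c = static_cast<unsigned char>(ch);
        if (c < 0x80) {
            out.push_back(static_cast<char>(c));                  // ASCII is unchanged
        } else {
            out.push_back(static_cast<char>(0xC0 | (c >> 6)));    // lead byte: 110000xx
            out.push_back(static_cast<char>(0x80 | (c & 0x3F)));  // continuation: 10xxxxxx
        }
    }
    return out;
}
```

For E9 (binary 1110 1001), the top two bits give the lead byte C3 and the low six bits give the continuation byte A9.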

Rick James