0

I am trying to retrieve some data from a MySQL database (version 5.5.35) using the mysql npm module in NodeJS (14.5.0).

1.) Here are the database's character set variables:

character_set variables

2.) Server's encoding:

enter image description here

3.) Here are the settings of the table I am fetching data from:

enter image description here

4.) Here is how I am initiating the connection:

const mysql = require('mysql');
const dbConfig = require('../../config/database.json');

dbConfig.connectionLimit = 10;
dbConfig.charset = 'utf8' // tried latin1, latin2 and utf8mb4

console.log(dbConfig);

const connection = mysql.createPool(dbConfig);
 
module.exports = connection;

5.) Here's the result wihtout any conversion:

enter image description here

That field's value should be Cieśliński.

That field's value right now is Cieœliñski.

6.) Here's the result when I set dbConfig.charset to different encodings:

  • latin1: Cie�li�ski.

  • latin1, then conversion from latin1 to utf8 with Iconv: Cie�li�ski.

  • utf8: Cieœliñski

  • utf8, then conversion from latin1 to utf8 with Iconv: CieÅ“liñski.

Proper value should be: Cieśliński.

Question: How should I set up my NodeJS server to properly fetch the proper value of the field? (Cieśliński)

NakedCat
  • 852
  • 1
  • 11
  • 40

1 Answers1

0

Hex 9CF1:

latin1:  ϖ
cp1250:  śń

That is, somewhere in the specification, you are using latin1 when you should be specifying cp1250.

UTF-8 (utf8, utf8mb4) is not involved.

After you have provided SELECT col, HEX(col) ..., I can tell you whether it is messed up in the table or not.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Sorry for responding late, here is the result: `Cieœliñski`, `4369659C6C69F1736B69` – NakedCat Oct 12 '20 at 06:21
  • Note the `9C` and `F1` in that hex string. – Rick James Oct 12 '20 at 20:49
  • This would mean that it's a Windows 1250 encoding. But whenever I convert `œñ` from latin1 to Windows 1250 with `Iconv` I get gibberish. EDIT: I actually get an ELISEQ error from what I remember. Can't check it right now. – NakedCat Oct 13 '20 at 09:12
  • Don't use any conversion routine; just tell MySQL the correct encoding. – Rick James Oct 13 '20 at 19:45
  • Interestingly - when I connected to the database using `cp1250` without any conversion routine and I turn the text into HEX inside of my application (instead of doing it directly in MySQL) I get this HEX: `4369653f6c693f736b69`. This is the field's value without any conversion routine: `Cie?li?ski`. – NakedCat Oct 14 '20 at 08:31
  • @Pe-Ter - That's because the configuration is misrepresenting what the encoding for displaying. Check `SHOW VARIABLES LIKE 'char%';` and the connection parameters for node.js and the them `` if relevant. I _think_ any of those could cause the "question marks". – Rick James Oct 14 '20 at 16:28