-6

I'm importing a CSV saved as UTF-8 on Linux, where it looks fine. The tables in the MySQL database are set to utf8, and so is the connection collation. I'm importing with "CSV using LOAD DATA" and the character set set to UTF8, yet the characters are being changed, e.g. ∙ becomes ∙ . What could cause this?

1 Answer

0

One possibility is that your MySQL database uses the utf8 encoding. utf8 in MySQL is not equivalent to standard UTF-8: it supports only a subset of Unicode characters, namely those that encode to at most three bytes. For instance, you cannot encode mathematical characters like ∙, the Bullet Operator.

If you want full support for all Unicode characters, you should try the utf8mb4 encoding instead, as described in the MySQL manual:

utf8, a UTF-8 encoding of the Unicode character set using one to three bytes per character.

utf8mb4, a UTF-8 encoding of the Unicode character set using one to four bytes per character.

See also the full discussion of Unicode support in the manual.
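The one-to-three versus one-to-four byte distinction quoted above can be checked for any given character. The following is a minimal sketch in Python, not MySQL itself: the helper names `utf8_len` and `fits_mysql_utf8` are hypothetical, and the only assumption is the manual's statement that utf8 stores at most three bytes per character. Note that the bullet operator comes out at three bytes, which agrees with the hex `E28899` quoted in the comments.

```python
def utf8_len(ch: str) -> int:
    """Number of bytes one character occupies in standard UTF-8."""
    return len(ch.encode("utf-8"))


def fits_mysql_utf8(text: str) -> bool:
    """True if every character needs at most three UTF-8 bytes.

    Assumption: MySQL's utf8 stores one to three bytes per character,
    as quoted from the manual; utf8mb4 allows up to four.
    """
    return all(utf8_len(ch) <= 3 for ch in text)


bullet_operator = "\u2219"   # the bullet operator from the question
emoji = "\U0001F600"         # a character outside the three-byte range

print(bullet_operator.encode("utf-8").hex(), utf8_len(bullet_operator))
print(emoji.encode("utf-8").hex(), utf8_len(emoji))
print(fits_mysql_utf8(bullet_operator), fits_mysql_utf8(emoji))
```

Running a check like this over the CSV before importing shows whether any character in it actually needs utf8mb4.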

Renzo
  • You should change the setting inside the database itself, changing the encoding of the tables, not the import settings of the command. – Renzo Aug 21 '16 at 07:15
  • See for instance [this](http://stackoverflow.com/a/38284057/2382734) or similar posts about changing MySQL encoding. – Renzo Aug 21 '16 at 07:19
  • If I change the settings on the affected columns and upload using that setting, it doesn't make any difference. I don't see a way to change the whole db; maybe that needs to be done. I have uploaded to this db before without problems using utf8, although that CSV would have been made on Windows. – Rafael Valentino Aug 21 '16 at 07:42
  • I found a workaround, but because the question has been downvoted so much I will be posting it elsewhere, although I thank Renzo for his input. – Rafael Valentino Aug 21 '16 at 09:40
  • The "bullet operator", hex `E28899`, is available in MySQL's utf8; utf8mb4 is not required. – Rick James Aug 22 '16 at 04:22
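Since the three bytes `E28899` fit in utf8, a mismatch between the file's encoding and the charset assumed somewhere along the import path is another candidate. The sketch below is only an illustration in Python: the thread does not confirm the cause, and the choice of cp1252 as the wrongly assumed charset is an assumption made for the example.

```python
# Bullet operator as it would be stored in a UTF-8 encoded CSV.
original = "\u2219"
raw = original.encode("utf-8")          # bytes E2 88 99

# Assumption: some step reads those bytes as cp1252 instead of UTF-8.
# Each byte then becomes a separate character.
garbled = raw.decode("cp1252")

# The damage is reversible as long as no byte was dropped or replaced.
repaired = garbled.encode("cp1252").decode("utf-8")

print(raw.hex())
print(len(original), len(garbled))
print(repaired == original)
```

If the imported rows show three characters where the file has one, this kind of double interpretation is worth ruling out before changing the table encoding.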