
While trying to use the LOAD DATA INFILE command in MySQL (InnoDB), I had trouble getting the encoding to work.

"something","something else",\N,"ANOTHER","2012-05-05T19:54:03","2012-12-08T16:14:53","SOMETHING","Something","Hello","HIHI","HEY","999.0","0.01","0.25","06/2012",\N,"2012-06-28","2012-06-28","2012-06-28","2009-03-02","2012-06-28",\N,"LOLOL","","LOLNON",\N,

Became

獯浥瑨楮朢Ⱒ獯浥瑨楮朠敬獥       䅎佔䡅刢Ⱒ㈰ㄲⴰ㔭〵吱㤺㔴㨰㌢Ⱒ㈰ㄲⴱ㈭〸吱㘺ㄴ㨵㌢Ⱒ协䵅呈䥎䜢Ⱒ卯浥瑨楮朢Ⱒ䡥汬漢Ⱒ䡉䡉   䡅夢Ⱒ㤹㤮〢Ⱒ〮〱   \N  \N  ㈰ㄲⴰ㘭㈸   ㈰〹ⴰ㌭〲   ㈰ㄲⴰ㘭㈸       䱏䱏䰢Ⱒ    0   0   \N  \N  \N  \N  \N  \N  \N  \N  \N  \N  \N  \N  \N  \N

Other people had similar problems, which were solved, as mine was, by setting the charset to 'utf8' (like this one and this one). None of those questions or answers made it clear why utf8 wouldn't work automatically.

This seemed strange, since people often mentioned that they were using utf8 as their default encoding. I was saving the data as UTF-8 in Notepad++, and the table's default collation was "utf8 - default collation".

I looked further and saw that my schema's default charset is utf16, with a default collation of utf16_general_ci. I believe that MySQL uses the schema default charset and collation for LOAD DATA INFILE commands.
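The garbling is consistent with UTF-8 bytes being read as big-endian UTF-16, where every pair of ASCII bytes collapses into one CJK-range character. Here is a minimal Python sketch of that effect, assuming the input is plain ASCII saved as UTF-8; the variable names are only for illustration, and this mimics the byte reinterpretation rather than MySQL's actual loader:

```python
# UTF-8 bytes as the editor would save them (ASCII only, so one byte per character).
# The sample is 26 bytes long, an even count, so it splits cleanly into 2-byte units.
raw = 'something","something else'.encode("utf-8")

# Read the same bytes as big-endian UTF-16: each pair of bytes becomes one code unit.
misread = raw.decode("utf-16-be")

# Two ASCII characters now appear as a single character.
print(misread)
print(len(raw), "bytes ->", len(misread), "characters")

# The original bytes are still intact underneath the wrong interpretation.
assert misread.encode("utf-16-be") == raw
```

The first characters printed correspond to the pairs "so", "me", "th", "in", which is how the garbled row above begins.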

Does MySQL use the schema charset for defaults on LOAD DATA INFILE, and if so, where is that documented? If not, where does the default charset come from?

Community
Jeutnarg

1 Answer


Basically yes.

The server uses the character set indicated by the character_set_database system variable to interpret the information in the file. SET NAMES and the setting of character_set_client do not affect interpretation of input. If the contents of the input file use a character set that differs from the default, it is usually preferable to specify the character set of the file by using the CHARACTER SET clause. A character set of binary specifies “no conversion.”

This passage is quoted from near the top of the documentation for LOAD DATA INFILE.

If you set that variable dynamically, then you could claim that you're not using the schema default, but that would be stretching the point.
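To avoid depending on the schema default at all, the CHARACTER SET clause from the quoted documentation can be written into the statement itself. Below is a rough Python sketch that only assembles the SQL text. The helper name build_load_data, the file path, and the table name are hypothetical, and the FIELDS and LINES options are guesses based on the sample row in the question:

```python
def build_load_data(path: str, table: str, charset: str = "utf8") -> str:
    """Return a LOAD DATA INFILE statement with an explicit CHARACTER SET clause.

    Illustration only: the values are interpolated without escaping, so this
    must not be used with untrusted input.
    """
    return (
        f"LOAD DATA INFILE '{path}' "
        f"INTO TABLE {table} "
        # The clause sits after INTO TABLE and before the FIELDS options.
        f"CHARACTER SET {charset} "
        "FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '\"' "
        "LINES TERMINATED BY '\\n'"
    )


# Hypothetical path and table name.
statement = build_load_data("/tmp/data.csv", "my_table")
print(statement)
```

Running SELECT @@character_set_database; in the same session shows which default would apply if the clause were left out.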

Jeutnarg