
I have a data file that was encoded as iso_1, and I converted it to UTF-8:

file -i test.txt:
... text/plain; charset=utf-8

and mysql character_set is:

| character_set_client     | utf8                       |
| character_set_connection | utf8                       |
| character_set_database   | utf8                       |
| character_set_filesystem | binary                     |
| character_set_results    | utf8                       |
| character_set_server     | utf8                       |
| character_set_system     | utf8                       |
| character_sets_dir       | /usr/share/mysql/charsets/ |

My question is: why do the Chinese characters still show up as garbled text (messy code)?

ºâÑô...      
JLOGAN
  • What is the relationship between the file, the database, ISO_1 (which does not exist, tags say you probably meant ISO 8859-1), chinese characters and messy code? How do you store your data to the database, how do you read it, how do you print it? – Amadan Jul 31 '18 at 03:59
  • I meant ISO-8859-1. The Sybase character set is iso-8859-1; I exported the data to a file, and I need to import the data from this file into MySQL. I used file -i to get the charset, and I read the data with cat ...| head -n – JLOGAN Jul 31 '18 at 06:13
  • You can't have Chinese characters in ISO 8859-1. If you stored Chinese characters in UTF-8 while pretending it's ISO 8859-1, your `file -i` would likely detect it as UTF-8, if the proportion of the Chinese characters was high enough. So I'm not sure what you actually have in your file. – Amadan Jul 31 '18 at 06:19

2 Answers


Which of these were you expecting?

                                     big5   6  2 '算栠'
                              gb2312, gbk   6  2 '衡阳'
                            eucjpms, ujis   6  2 '財剩'

ºâÑô is "Mojibake" for one of those. See "Trouble with UTF-8 characters; what I see is not what I stored".
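To illustrate how such Mojibake arises, here is a minimal Python sketch. It assumes the exported bytes were really in a GB-family encoding (gb2312 is used here because it matches one row of the table above) and were then read as ISO 8859-1. The variable names are made up for the illustration.

```python
# Assumption: the original text was stored as GB-encoded bytes.
raw = "衡阳".encode("gb2312")

# Decoding as ISO 8859-1 never raises an error, because every byte
# maps to some character, but the result is Mojibake.
garbled = raw.decode("iso-8859-1")
print(garbled)

# Undoing the wrong decoding step and applying the right one
# recovers the original text.
repaired = garbled.encode("iso-8859-1").decode("gb2312")
print(repaired)
```

The same round trip can be tried with the other candidate encodings to see which one produces the text you expected.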

Some of the character_set_* settings reference the encoding in the client. It is quite OK for a column to be utf8mb4 while the client is using big5 or gb2312 (etc), but you must do SET NAMES big5 or the equivalent.

Rick James

Thanks, guys. I found that converting the file from gb18030 to UTF-8 worked. But I don't know why file -i showed the file's charset as iso-8859-1.
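For reference, the conversion can be sketched in Python roughly as follows. The function names and file paths are hypothetical, and the source encoding is assumed to be gb18030 as described above. The second helper hints at a likely reason for the iso-8859-1 report (an assumption, not a confirmed diagnosis): the GB bytes are not valid UTF-8, so a tool that guesses from byte patterns may fall back to a single-byte charset.

```python
def convert_to_utf8(src_path, dst_path, src_encoding="gb18030"):
    """Re-encode a text file from src_encoding to UTF-8 (hypothetical helper)."""
    # newline="" keeps the original line endings untouched.
    with open(src_path, "r", encoding=src_encoding, newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(line)


def is_valid_utf8(data):
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

A call such as `convert_to_utf8("export.txt", "export.utf8.txt")` would then produce a file that can be loaded with the connection character set set to utf8.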

JLOGAN