3

I have a table encoded in latin1 and collate latin1_bin.

In my table there is a column comments of type 'TEXT', as you know this column inherits table's encoding and collation, but from now on I should change it to be utf8 and utf8_general_ci because I'm starting to store special characters in comments.

Would it cause any downside effect if I'd use a command like the following?

alter table notebooks modify comments text CHARACTER SET utf8 COLLATE utf8_general_ci;

Thank you for your answer.

phaberest
  • 3,140
  • 3
  • 32
  • 40
孙悟空
  • 1,215
  • 1
  • 11
  • 26
  • yes, you can set per-field collations. what you set on the table simply becomes the default for any fields which don't have one set explicitly. but you'd probably be better off converting the entire table anyways, and keep everything in a single charset. – Marc B Jul 08 '16 at 14:13
  • I totally agree with @MarcB. Do a local backup of your database, convert the encoding and check your data. You may follow these instructions -> http://stackoverflow.com/questions/6115612/how-to-convert-an-entire-mysql-database-characterset-and-collation-to-utf-8 – phaberest Jul 08 '16 at 14:23
  • By the way, will it SLOW DOWN my query ? – 孙悟空 Jul 08 '16 at 14:30

2 Answers2

4

Danger I think that that ALTER will destroy existing text.

Also, ... Your 'name' looks Chinese, so I would guess that you want to store Chinese characters? In that case, you should use utf8mb4, not just utf8. This is because some of the Chinese characters take 4 bytes (and are not in the Unicode BMP).

I believe you need 2 steps:

ALTER TABLE notebooks MODIFY comments BLOB;
ALTER TABLE notebooks MODIFY comments TEXT
          CHARACTER SET utf8mb4  COLLATE utf8mb4_general_520_ci;

Otherwise the latin1 characters will be "converted" to ut8. But if you really have Chinese in the column, you do not have latin1. The 2-step alter, above, does (1) turn off any knowledge of character set, and (2) establish that the bytes are really utf8mb4-encoded.

To be safer, first do

RENAME TABLE notebooks TO old;
CREATE TABLE notebooks LIKE old;
INSERT INTO notebooks SELECT * FROM old;

Then do the two ALTERs and test the result. If there is trouble, you can RENAME to get back the old copy.

Rick James
  • 135,179
  • 13
  • 127
  • 222
  • Thank you for your answer. Actually i store french accented characters like éèùçà ... so utf8 works well in my case. I was in danger, but fortunately, after doing ALTER, the existing text aren't destroyed.(i don't know why ???) thank you for utf8mb4 information. It's interesting to know that. – 孙悟空 Jul 09 '16 at 20:10
  • Alas, I still can't predict everything in this area. I'm glad the single alter did not destroy your data. – Rick James Jul 10 '16 at 00:21
0

Specifying any collating sequence that does not involve direct integral comparison of a NATIVE character set will slow down your query. Whether it will slow it down noticeably is another issue. Looking up the ranking of this, and the ranking of that, in a table and comparing the two results is much, MUCH faster than retrieving on-disk information from a database, wouldn't you imagine?