I'm trying to switch my site over to UTF-8 completely, so I don't have to deal with utf8_encode() & utf8_decode() functions.

I have the collation of my tables set properly, and I'm temporarily using the query SET NAMES utf8 to override the my.cnf file.

My question is — there are a ton of character set and collation variables in my.cnf, and I suspect that some ought to be left alone... which ones should I change to achieve the effect of SET NAMES utf8?

(The collation of my tables is utf8_unicode_ci.)

character_set_client | latin1 |
character_set_connection | latin1 |
character_set_database | latin1 |
character_set_filesystem | binary |
character_set_results | latin1 |
character_set_server | latin1 |
character_set_system | utf8 |

collation_connection | latin1_swedish_ci |
collation_database | latin1_swedish_ci |
collation_server | latin1_swedish_ci |
JKS
  • Character set and collation are not the same thing, just to let you know. MySQL defaults to the `latin1` character set with the `latin1_swedish_ci` collation. Check this question for a good description of the difference between the two: http://stackoverflow.com/questions/341273/what-does-character-set-and-collation-mean-exactly – ubiquibacon Jul 28 '10 at 22:06
  • A related answer is [here](http://stackoverflow.com/questions/3513773/change-mysql-default-character-set-to-utf8-in-my-cnf/3513812#3513812). – dma_k Nov 10 '11 at 11:17

1 Answer

Well, collation is primarily for sorting and comparing strings, so unless you're storing a language with specific sorting needs, utf8_unicode_ci should be fine.

The character_set_* values are used for all other string operations internally: value checks in places like WHERE clauses or IF/CASE statements, and string functions like CHAR_LENGTH(), REPLACE(), SUBSTRING() - that sort of stuff.
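
As a rough illustration of why the character set matters to string functions (a sketch only; the hex literal and the column aliases are examples I picked, not anything from the question), the same two bytes count as one character when read as utf8 and as two when read as latin1:

```sql
-- 0xC3A9 is the UTF-8 encoding of "é" (two bytes).
-- The _utf8 / _latin1 introducers tell MySQL how to interpret those bytes.
SELECT
  CHAR_LENGTH(_utf8   X'C3A9') AS utf8_chars,    -- 1 character
  CHAR_LENGTH(_latin1 X'C3A9') AS latin1_chars,  -- 2 characters
  LENGTH(_utf8 X'C3A9')        AS bytes;         -- 2 bytes either way
```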

Generally speaking, they should all be the same (in this case, utf8) except for filesystem - I'd recommend keeping that at binary unless you have a specific need to move away from that.
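
For reference, here is roughly what SET NAMES utf8 changes, spelled out as individual session settings. Treat it as a sketch rather than a drop-in config: the permanent equivalents would normally go in my.cnf as character-set-server and collation-server under [mysqld] (plus default-character-set under [client]), and option names have changed between MySQL versions, so check the manual for yours.

```sql
-- SET NAMES utf8 sets these three session variables in one go.
SET character_set_client     = utf8;
SET character_set_results    = utf8;
SET character_set_connection = utf8;

-- On its own, that leaves collation_connection at the default collation
-- for utf8 (utf8_general_ci). To match tables that use utf8_unicode_ci,
-- name the collation explicitly:
SET NAMES utf8 COLLATE utf8_unicode_ci;

-- Verify the result; character_set_filesystem should still be binary.
SHOW VARIABLES LIKE 'character\_set\_%';
SHOW VARIABLES LIKE 'collation\_%';
```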

Peter Bailey
  • It was definitely the filesystem/binary setting that inspired this question; that, and I couldn't find any documentation about what specifically each of these did. I mean, my instinct was just to change them all to utf8/utf8_unicode_ci — but the binary setting sapped my confidence. I was aware that collation is different, but at the same time, having latin1_swedish_ci didn't make sense to me at all, so I switched it to UTF-8 as well — if only for consistency's sake. Anyway, thanks for the answer... I think I'll change all except `character_set_filesystem`. – JKS Jul 30 '10 at 03:26