According to the MySQL manual, MySQL includes character set support that lets us store data using a variety of character sets and perform comparisons according to a variety of collations. Character sets can be specified at four different levels:

  • Server
  • Database
  • Table
  • Column

Suppose I have a database that stores the following:

  • User ID (INT)
  • Email address (VARCHAR(50))
  • User profile (TEXT, multi-language)
  • System flag (CHAR(1), a-z only)

Between Latin1 and UTF-8, which should I choose for the four different levels to achieve the best possible performance?

ADD NOTE: This is just a simplified example. In a real scenario, I would expect several columns storing only a-zA-Z0-9 and one or two columns storing multilingual text. That is why I am concerned about performance.

ADD NOTE2: I am referring to a database that stores millions of records. That is why performance matters to me.

Question Overflow

2 Answers

I might be wrong, but in my experience the character set you choose doesn't have a big impact on overall database performance. Mixing different character sets across tables, however, might affect query performance.

If you want to support multiple languages, go for utf8 (or even utf16).
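
To put rough numbers on the storage side of the question, here is a minimal Python sketch that compares the encoded size of a few sample values. The sample strings and the `encoded_size` helper are made up for illustration, and Python's `latin-1` codec only stands in for MySQL's `latin1` (the two are close but not identical). It shows byte counts of the encoded text only; what MySQL actually stores also depends on the storage engine and row format.

```python
# Hypothetical sample values, loosely modelled on the columns in the question.
samples = {
    "system_flag": "a",
    "email": "user@example.com",
    "profile_fr": "café",
    "profile_ja": "日本語",
}


def encoded_size(text, encoding):
    """Return the byte length of text in the given encoding,
    or None if the encoding cannot represent the text."""
    try:
        return len(text.encode(encoding))
    except UnicodeEncodeError:
        return None


# Map each sample name to a (latin-1 size, utf-8 size) pair.
sizes = {
    name: (encoded_size(text, "latin-1"), encoded_size(text, "utf-8"))
    for name, text in samples.items()
}

for name, (latin1_size, utf8_size) in sizes.items():
    print(f"{name}: latin-1={latin1_size} bytes, utf-8={utf8_size} bytes")
```

Pure ASCII values (the flag and the email address) take the same number of bytes in both encodings, accented Latin text grows slightly in UTF-8, and the Japanese sample cannot be encoded in Latin-1 at all.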

Bjoern

You should choose the same encoding for the whole database. Otherwise you, as a developer, will be confused later. And since the text is multilingual, that leaves only utf8 as the encoding to choose.

Note that you can choose an encoding for the database connection, too.
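
As a rough illustration of why the connection encoding matters, the following Python sketch simulates a mismatch without involving MySQL at all: bytes written as UTF-8 are read back as if they were Latin-1. The variable names are made up for this example. In MySQL itself, the connection character set is typically set with a statement such as `SET NAMES utf8`.

```python
text = "日本語"

# Bytes as a client using UTF-8 would send them.
stored = text.encode("utf-8")

# The same bytes interpreted by a reader that assumes Latin-1:
# every byte becomes one character, so the text turns into mojibake.
misread = stored.decode("latin-1")

# The damage is reversible only if you know which wrong decoding was applied.
roundtrip = misread.encode("latin-1").decode("utf-8")

print(len(text), len(misread))  # character counts before and after misreading
print(roundtrip == text)
```

Keeping one encoding across the server, database, table, column, and connection avoids having to track this kind of mismatch.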

Roland Illig
  • Yes, I think "server" refers to the database connection or instance, and "database" refers to the schema in the manual. I am not sure about the performance impact, since there will be millions of rows to deal with. – Question Overflow Nov 15 '11 at 08:02