0

I'm experiencing some problems with special characters. When a person tries to register on my site using for example the letter 'ö', I get this error message:

1366 - Incorrect string value: '\xF6m' for column 'username' at row 1

insert into phpbb3_users (username, username_clean, user_password) values ('Ström', 'Ström', '$H$9iK6K37VoHM//')

I understand the character settings could be the problem and when I check the database, I can see that some tables have collation:

latin1_swedish_ci 

and some tables have collation (for example the table 'phpbb3_users' that is having problem in the example above):

utf8_bin

And when I check my MySql settings I see that I have these mixed settings:

| character_set_client | latin1 |
| character_set_connection | latin1 |
| character_set_database | latin1 |
| character_set_filesystem | binary |
| character_set_results | latin1 |
| character_set_server | latin1 |
| character_set_system | utf8 |
| character_sets_dir | /usr/share/mysql/charsets/ |
| collation_connection | latin1_swedish_ci |
| collation_database | latin1_swedish_ci |
| collation_server | latin1_swedish_ci 
  1. What character_set do you recommend me to use? I have learned that changing character_set to utf8mb4 could solve the problem, but does my MySql version (5.1.4) support that?
  2. Were should I make character_set changes? In all instances of the /etc/my.cnf and all table collations?

Thanks.

Carl
  • 149
  • 10

1 Answers1

2

You need to:

  1. set the character set of the column (or the next higher default charset of the table/database/server) to a character set that supports the character you want to store. If you want to store "ö", latin1 will do. If you want to store more "exotic" characters, utf8 is a great choice, because it support virtually everything.*
    Note that the collation like latin1_swedish_ci is pretty much irrelevant. The collation follows from the charset you choose, but the charset is first and foremost the import variable here.
  2. set the connection charset to the charset you're going to send text in. If you're going to send UTF-8 encoded text to the database, you need to tell it that by setting the connection charset to utf8. This is most likely where your problem is. How to set it depends on how you connect to the database. mysql_set_charset, a charset parameter in your DSN connection string or a SET NAMES utf8 query will do.
  3. send text encoded in the encoding you declared in step 2.

* MySQL's utf8 is not "real" UTF-8, supporting only a subset of all of Unicode. utf8mb4 is "real" UTF-8, supporting everything. utf8mb4 is not available on MySQL 5.1, only 5.5+.

For more information, see UTF-8 all the way through, Handling Unicode Front To Back In A Web App, What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text.

Community
  • 1
  • 1
deceze
  • 510,633
  • 85
  • 743
  • 889
  • Thanks! **1.** Does that mean I should update MySQL settings to: `| character_set_client | **utf8** | | character_set_connection | **utf8** | | character_set_database | **utf8** | | character_set_filesystem | binary | | character_set_results | **utf8** | | character_set_server | **utf8** | | character_set_system | utf8 | | collation_connection | **utf8** | | collation_database | **utf8** | | collation_server | **utf8**` **2.** Should I update all table and column collations in phpMyAdmin from latin1_swedish_ci to utf8_bin? Don't think utf8 exists there. – Carl Oct 15 '13 at 11:12
  • 1. You don't need to set all those to `utf8`, all those settings have their individual place where they're used. The important thing is that the *connection encoding parameters* are set correctly, which you best do in the client-side API (in PHP, using whatever method is used for that in your database adapter of choice). Consult the manual what setting does what if you're interested. 2. If you want to store `utf8`, your columns need to be `utf8`, yes. If `latin1` will do, that's fine too. That's up to you and what kind of text you expect to be storing. – deceze Oct 15 '13 at 11:15
  • **1.** I've decided to go for `utf8` and I will update columns to `utf8_bin` (since just `utf8` does not exist in my tables). I hope that is correct? **2.** You wrote "The important thing is that the connection encoding parameters are set correctly". Which of my MySQl's settings are "connection encoding parameters"? Or maybe it is just best to change them all to `utf8`? Thanks for helping me out deceze. – Carl Oct 15 '13 at 11:35
  • 1. `utf8_bin` is the *collation*. Choosing the collation implicitly chooses the charset, yes, but those are actually two different attributes. 2. I'm honestly not sure what each individual parameter does *exactly*, read the manual if you need to know. Again: just use whatever method your PHP database adapter wants you to use to set the connection encoding. Read the other linked pages and the manual for your database adapter. – deceze Oct 15 '13 at 13:16