I set up a MySQL table using the collation latin1_swedish_ci. My PHP scripts are written in the UTF-8 character set. It all worked well until I discovered that a person had entered their first name with a non-Swedish character. The name Helén looked strange in the database but looked OK on the web pages.

My question is: can I simply change the collation of my table from latin1_swedish_ci to a UTF-8 collation?

Will it actually cause any problems?

Toby Allen
Andreas
  • It's the character set, not the collation, that is at issue here –  Jun 30 '13 at 10:15
  • I think you should put this right after you connect to your database in your PHP code: mysql_query('SET NAMES utf8'); but as I'm not sure, I didn't post it as an answer – Vladimir Jun 30 '13 at 10:16
  • The `latin1` encoding can perfectly represent "é", that's not the issue. The issue is the *connection encoding.* [Handling Unicode Front To Back In A Web App](http://kunststube.net/frontback/) – deceze Jun 30 '13 at 10:18
  • @deceze Ok. So does this mean I do not actually do anything about it or is there some change I need to do? – Andreas Jun 30 '13 at 10:19
  • You need to fix the connection encoding as outlined in the sites linked above. The database encoding is fine, as long as you only need to store the "western" characters that the latin1 encoding can represent. If you need to store other characters too, you should go with UTF-8. [What Every Programmer Absolutely, Positively Needs To Know About Encodings And Character Sets To Work With Text](http://kunststube.net/encoding/) – deceze Jun 30 '13 at 10:21

1 Answer

You can do it by adding

```php
$db->set_charset("utf8"); // MySQL
mb_internal_encoding("UTF-8"); // Set PHP encoding
header("Content-Type: text/html; charset=utf-8"); // Prevent incorrect encoding in Browser
```

at the top and escaping every parameter using

```php
$param = $db->escape_string($param);
```

MySQL will then convert between the connection encoding and the column encoding for you. Once this is in place, new data passes between PHP, MySQL, and the browser as UTF-8.
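The connection setup described above could look roughly like the following. This is only a minimal sketch under assumptions: the function names, the credentials passed in, and the `users` table with a `first_name` column are all hypothetical, and it uses `utf8mb4` (which needs a reasonably recent MySQL) and a prepared statement instead of `escape_string`.

```php
<?php
// Hypothetical helper: opens a mysqli connection and sets the
// connection charset so MySQL knows the client sends and expects UTF-8.
function open_utf8_connection(string $host, string $user, string $pass, string $dbname): mysqli
{
    // Make mysqli throw exceptions instead of failing silently.
    mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT);

    $db = new mysqli($host, $user, $pass, $dbname);
    $db->set_charset('utf8mb4'); // connection encoding, not the table collation
    return $db;
}

// Hypothetical helper: inserts a name using a prepared statement.
// The table and column names are placeholders for illustration.
function insert_first_name(mysqli $db, string $firstName): void
{
    $stmt = $db->prepare('INSERT INTO users (first_name) VALUES (?)');
    $stmt->bind_param('s', $firstName);
    $stmt->execute();
    $stmt->close();
}
```

With the connection charset set this way, MySQL should convert "é" to the single latin1 byte when writing to a latin1 column, so the table itself would not have to change for western European names.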

Your problem with the name is probably related to the header() call, which means the browser chooses a different charset than PHP. You can fix your data like this:

```php
$data = utf8_encode($data);
```

If that produces bad results, try this one:

```php
$data = utf8_decode($data);
```

After that everything should look fine.
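To see what those two conversions do at the byte level, here is a small sketch using `iconv` between ISO-8859-1 and UTF-8, which is the same pair of encodings `utf8_encode` and `utf8_decode` convert between. It assumes the name was sent as UTF-8 over a connection that MySQL treated as latin1, which is one plausible explanation for the "strange" value in the database, not a confirmed diagnosis.

```php
<?php
// "Helén" as UTF-8 bytes, written with escapes so the example
// does not depend on the encoding of this source file.
$utf8 = "Hel\xC3\xA9n";

// Treat those UTF-8 bytes as if they were latin1 and convert to UTF-8.
// Each byte of "é" becomes its own character ("Ã" and "©").
$doubleEncoded = iconv('ISO-8859-1', 'UTF-8', $utf8);

// Converting back from UTF-8 to latin1 undoes the double encoding.
$repaired = iconv('UTF-8', 'ISO-8859-1', $doubleEncoded);

echo bin2hex($utf8), "\n";          // 48656cc3a96e
echo bin2hex($doubleEncoded), "\n"; // 48656cc383c2a96e
echo bin2hex($repaired), "\n";      // 48656cc3a96e
```

Applying the conversion in the wrong direction, or twice, makes the data worse instead of better, so it is worth checking the bytes on a copy of the data first.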

If you want to make UTF-8 the default charset, consider adding the following to the [mysqld] section of my.cnf:

```ini
# Set UTF-8 as standard
collation-server = utf8mb4_general_ci
init-connect='SET NAMES utf8mb4'
character-set-server = utf8mb4
```

That's what I did, and I haven't had any issues in a year of development using everything mentioned in this post.

Lorenz
  • `mb_internal_encoding` is only relevant for the mb_ functions, nothing else. `utf8_encode` and `_decode` are usually completely unnecessary and *encoding conversions* are best handled elsewhere. PHP never handles anything encoding related "for you". You just have to get it right at every interface between different systems. Sorry, your post is mostly a random accumulation of encoding related functions without really explaining what's happening or what needs to happen. – deceze Jun 30 '13 at 12:30