I coded a PHP project in ISO 8859-1, and for technical reasons I want to convert the project to UTF-8. What is the best way to do it? I am afraid of losing special characters like French accents. Thanks for your advice.
-
this might help: http://stackoverflow.com/questions/910793/php-detect-encoding-and-make-everything-utf-8 – Aziz Nov 30 '09 at 23:19
-
I assume that you're talking about saving PHP source code files as UTF-8 instead of ISO-8859-1: have you tested it anyway? ISO-8859-1 characters fall in the same UTF-8 range as well (but not vice versa). If so, what problems exactly did you have when converting? – BalusC Nov 30 '09 at 23:20
-
@BalusC That is not entirely true. They both have a common subset, known as ASCII, but the other half of ISO-8859-1 is encoded differently in UTF-8. – troelskn Dec 01 '09 at 00:01
-
True, but this does not apply if you use a tool which can open them as ISO-8859-1 and save them as UTF-8. The other way round isn't possible. The average text editor/IDE can perfectly do that. – BalusC Dec 01 '09 at 00:49
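To make the point of the comments above concrete, here is a minimal sketch of what happens to one accented character at the byte level. It assumes PHP's iconv extension is available; the character is written as a byte escape so the snippet does not depend on how the snippet file itself is saved.

```php
<?php
// "é" as a single ISO-8859-1 byte (0xE9).
$latin1 = "\xE9";

// iconv(from, to, string) re-encodes the bytes.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

echo bin2hex($latin1), "\n"; // e9   (one byte)
echo bin2hex($utf8), "\n";   // c3a9 (two bytes)

// Plain ASCII is byte-for-byte identical in both encodings.
var_dump(iconv('ISO-8859-1', 'UTF-8', 'abc') === 'abc'); // bool(true)
```

This is why a file containing accents has to be re-encoded rather than just relabelled, while a pure ASCII file needs no change.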
-
Thanks, guys, for the advice. I changed the encoding in the editor and copied all of my old files into new files. This is odd, but I didn't want to write extra lines of code to decode/encode. – P.M Dec 01 '09 at 18:18
3 Answers
Transcode all the files with iconv. Change any and all HTTP headers or meta tags. Profit.
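A rough sketch of the transcoding step, done from PHP rather than the shell. The helper name `convertFileToUtf8` and the `.bak` backup suffix are made up for illustration; the sketch assumes the iconv extension is loaded and that every file that is not already valid UTF-8 really is ISO-8859-1.

```php
<?php
/**
 * Hypothetical helper: convert one file from ISO-8859-1 to UTF-8 in place.
 * Returns true if the file was rewritten, false if it was left untouched.
 */
function convertFileToUtf8(string $path): bool
{
    $bytes = file_get_contents($path);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read $path");
    }

    // Skip files that are already valid UTF-8 (pure ASCII included), so that
    // running the script twice does not double-encode anything.
    // preg_match() with the /u modifier fails on invalid UTF-8 input.
    if (preg_match('//u', $bytes) === 1) {
        return false;
    }

    $converted = iconv('ISO-8859-1', 'UTF-8', $bytes);
    if ($converted === false) {
        throw new RuntimeException("Cannot convert $path");
    }

    // Keep a backup next to the original before overwriting it.
    copy($path, $path . '.bak');

    return file_put_contents($path, $converted) !== false;
}
```

Calling this for each source file in the project (for example from a loop over `glob()` results) would cover the first step; the headers and meta tags still have to be changed separately. Work on a copy or under version control, since the validity check is only a heuristic.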

You should try using the shell command iconv to convert the PHP files from latin1 (ISO-8859-1) to UTF-8.
After that you should make sure that PHP uses UTF-8 as the default encoding (the default_charset directive in php.ini). If not, then you can set it with ini_set() for your project.
After that you should convert your database to UTF-8 or use a quickfix like this (for MySQL):
mysql_query("SET NAMES 'utf8'");
Of course, you replace mysql_query() with whatever your framework provides (if you use one). Put it into your primary file, the one that includes all the classes and so on.
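The PHP-side settings described in this answer could be sketched roughly as follows. The php.ini directive in question is `default_charset`. The function name `useUtf8Connection` is hypothetical, and the `mysqli` variant is an assumption on my part: the old `mysql_*` functions were removed in PHP 7, and `utf8mb4` is MySQL's full UTF-8 character set.

```php
<?php
// Make PHP label its output as UTF-8; this is the runtime equivalent of
// setting default_charset in php.ini.
ini_set('default_charset', 'UTF-8');

// Optionally state the charset explicitly in the HTTP header as well.
// header() only works before any output has been sent.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

/**
 * Hypothetical helper: ask MySQL to talk UTF-8 on this connection.
 * Assumes the mysqli extension; set_charset() plays the role of the
 * "SET NAMES" query and also informs the client library.
 */
function useUtf8Connection(mysqli $db): void
{
    if (!$db->set_charset('utf8mb4')) {
        throw new RuntimeException('Could not switch the connection to utf8mb4');
    }
}
```

These lines belong in whatever bootstrap file every request passes through, so that they run before any output or query.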

-
Thanks, Kristinn, for the help. I changed the environment. Next time I will take care of the encoding from the start. – P.M Dec 01 '09 at 18:20
-
Here's my take on your question: you want the generated HTML (via PHP) to be UTF-8? Be aware that HTTP/1.1 historically treated text served without a declared charset as ISO-8859-1, so an HTML page that does not state its encoding may be read as Latin-1. XML, and therefore XHTML served as XML, is assumed to be UTF-8 (or UTF-16) unless it declares otherwise.
(1) So the first piece of the puzzle is to select your DOCTYPE for your rendered HTML.
(2) Make sure you add the character set meta tag (charset=utf-8), etc.
(3) Take the rendered PHP/HTML string and send it through iconv, either via the shell using a system call or through some PHP API method.
The resulting rendered HTML will be UTF-8 encoded. The client browser needs to render the HTML as UTF-8 and not as Western (Latin-1). Otherwise you get a strange non-breaking space character in the upper left-hand corner of the page.
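Step (3) could be done with PHP's output buffering instead of a system call. This is only a sketch under assumptions: the callback name `latin1ToUtf8` is made up, the iconv extension must be loaded, and the page output must really be ISO-8859-1 (if the source files have already been converted to UTF-8, this would double-encode the accents).

```php
<?php
/**
 * Hypothetical output callback: re-encode ISO-8859-1 output as UTF-8.
 * Falls back to the unchanged buffer if iconv reports a failure.
 */
function latin1ToUtf8(string $buffer): string
{
    $converted = iconv('ISO-8859-1', 'UTF-8', $buffer);
    return $converted === false ? $buffer : $converted;
}

// Declare the encoding the browser should use.
if (!headers_sent()) {
    header('Content-Type: text/html; charset=UTF-8');
}

// Everything echoed from here on is collected and passed through the
// callback when the buffer is flushed.
ob_start('latin1ToUtf8');

echo "<p>caf\xE9</p>"; // Latin-1 bytes, as a legacy page would produce them

ob_end_flush(); // sends the UTF-8 version to the client
```

This keeps the legacy files untouched, at the cost of converting on every request; converting the files once is usually the cleaner long-term fix.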

-
And there's always the quick and dirty way: send the rendered HTML through MySQL using a dumb query, e.g. SELECT \ as 'html'. This assumes you have MySQL and have its character encoding defaulting to UTF-8 (SET NAMES works also). – tracy.brown Dec 01 '09 at 00:49