4

I want to know about handling Japanese text: 1) Which encoding is best for a MySQL database? 2) How can I print that text in an HTML page? Thanks in advance.

Makoto
coderex
  • I tagged this as language-agnostic since "Japanese" can be really replaced with any other language here. – Quassnoi Jun 25 '09 at 18:05
  • "any other language", really? I was under the impression that Japanese had a considerably larger character set than most other languages... silly me. – bendin Jun 25 '09 at 18:28
  • Um, "language-agnostic" refers to programming languages, not natural languages. Come to think of it, so does "language", but it's too vague to be of any use. – Alan Moore Jun 25 '09 at 18:38
  • @Alan M: I knew it was a lame joke :) – Quassnoi Jun 25 '09 at 18:59

3 Answers

10

UTF-8 without a doubt. Make everything UTF-8. To put UTF-8 encoded text on your web page, use this within your HEAD tag:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />

As for MySQL, put the following into your my.cnf (config) file:

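[mysqld]
# Server-wide defaults for newly created databases and tables.
# Each option is set once; default-character-set and default-collation
# are omitted because newer MySQL servers reject them under [mysqld].
character_set_server=utf8
collation_server=utf8_unicode_ci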

If you're getting garbage characters from the database from queries executed by your application, you might need to execute these two queries before fetching your Japanese text:

SET NAMES utf8;
SET CHARACTER SET utf8;
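
To see where those "garbage characters" come from, here is a minimal sketch in Python (not part of the original setup; Latin-1 is only an assumed stand-in for whatever non-UTF-8 charset the connection might be using). It reads the same UTF-8 bytes back once with the wrong encoding and once with the right one:

```python
# Sketch: the same bytes decoded with a mismatched and a matching encoding.
text = "日本語"  # three Japanese characters

utf8_bytes = text.encode("utf-8")        # what a UTF-8 client would send
misread = utf8_bytes.decode("latin-1")   # a Latin-1 reader sees one character per byte
roundtrip = utf8_bytes.decode("utf-8")   # a UTF-8 reader recovers the original text

print(len(text), len(utf8_bytes))        # 3 characters, 9 bytes
print(repr(misread))                     # unreadable sequence of accented/control characters
print(roundtrip == text)                 # True
```

The bytes are never damaged; only the interpretation differs, which is why making the connection charset match the stored data fixes the output.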
karim79
  • It's easier to just set default_charset in PHP: ini_set('default_charset', 'UTF-8'); – Peter Bailey Jun 25 '09 at 18:04
  • Great but he hasn't mentioned PHP anywhere, and putting a meta tag into a document is much simpler than tweaking a setting. Plus, that could fail if he's on a shared host. – karim79 Jun 25 '09 at 18:08
  • Wait, it's there now, either I didn't see it or it was just added. – karim79 Jun 25 '09 at 18:15
5

Make sure that:

  1. The database is in UTF-8
  2. The database table is in UTF-8
  3. The output headers are in UTF-8
  4. The HTML meta tag is in UTF-8

When everything is talking the same encoding, you can live happily :)

For MySQL: utf8 charset, utf8_general_ci collation.

For PHP headers:

header('Content-type: text/html; charset=UTF-8') ;

For HTML:

<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
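
The "same encoding everywhere" rule for the output side can be sketched as follows. This is an illustrative Python example, not the PHP from this answer; `build_response` is a hypothetical helper name, and the point is only that the header, the meta tag, and the actual bytes all come from one charset value:

```python
from html import escape


def build_response(body_text, charset="utf-8"):
    """Return (headers, body_bytes) where header, meta tag and bytes agree."""
    html = (
        "<!DOCTYPE html>\n"
        "<html><head>\n"
        f'<meta http-equiv="Content-Type" content="text/html; charset={charset}" />\n'
        "</head><body>\n"
        f"<p>{escape(body_text)}</p>\n"  # escape &, <, > in the text
        "</body></html>\n"
    )
    body_bytes = html.encode(charset)  # the bytes really are in that charset
    headers = {
        "Content-Type": f"text/html; charset={charset}",
        "Content-Length": str(len(body_bytes)),  # length in bytes, not characters
    }
    return headers, body_bytes


headers, body = build_response("こんにちは")
print(headers["Content-Type"])
```

Note that Content-Length counts bytes; with Japanese text in UTF-8 that is larger than the number of characters.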
Shadi Almosri
  • Thank you very much, Shadi Almosri. Could you also tell me which encoding is most suitable for a single system that handles all of these languages: 1. Japanese 2. English 3. French 4. Italian 5. Turkish 6. German 7. Spanish? – coderex Jun 25 '09 at 18:33
  • 1
    UTF-8 is made to handle them all very well. I recently released a website in 6 languages: Polish, German, Spanish, French, Italian and Greek. UTF-8 will cover the languages you've specified. – Shadi Almosri Jun 25 '09 at 19:12
  • Is there any issue with browsers displaying this many languages? I sometimes have problems. How do you solve that kind of issue with unsupported font characters? – coderex Jun 25 '09 at 19:33
  • Just make sure you use a font that supports such characters in your page. In this case using something like Arial is a safe bet. You can see this by simply going to a web page such as: http://www.google.com/search?q=japanese+website You'll see you don't need anything extra to display them. Just ensure every part of your system uses UTF-8 and then use a standard font such as Arial for the web page display. – Shadi Almosri Jun 25 '09 at 19:50
  • Also the link provided by Peter Bailey is a very useful step by step guide to ensuring the setup i've outlined above is correct. http://developer.loftdigital.com/blog/php-utf-8-cheatsheet – Shadi Almosri Jun 25 '09 at 19:52
  • OK, thank you, that will be very helpful for me. One more thing: how can I set this in PostgreSQL? collation_server=utf8_unicode_ci character_set_server=utf8 default-character-set=utf8 default-collation=utf8_general_ci collation-server=utf8_general_ci – coderex Jun 25 '09 at 19:52
  • What do you use to create your postgres tables? do you have a control panel such as phppgadmin or are you using pure SQL statements? – Shadi Almosri Jun 25 '09 at 20:09
  • Here is a link that might help you for postgresql setup of a unicode db: http://www.olat.org/docu/install/Database_and_UTF-8_configuration.html The line of interest is: CREATE DATABASE testdb WITH OWNER = replace_with_your_admin_user_eg_pgsql ENCODING = 'UNICODE' TABLESPACE = pg_default; – Shadi Almosri Jun 25 '09 at 20:11
  • I am using another application, pgAdmin III. – coderex Jun 25 '09 at 20:15
  • Well, there should be an option within it when creating the database to set the "Encoding" to "Unicode"; that will have the same effect as when creating the MySQL tables. If all the information so far has helped you carry out the required features, then please tick the answer as "Answered". Shadi – Shadi Almosri Jun 25 '09 at 20:44
1

Update: This Q&A suggests that CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_520_ci is best in newer versions of MySQL.
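
The practical difference is that MySQL's utf8 charset stores at most 3 bytes per character, while utf8mb4 allows 4. A small Python sketch (the helper name `fits_in_mysql_utf8` is hypothetical) shows which characters would not fit; common Japanese text is fine, but characters outside the Basic Multilingual Plane, such as some rare kanji and emoji, are not:

```python
def fits_in_mysql_utf8(text):
    """True if every character needs at most 3 bytes in UTF-8."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


common = "日本語"      # everyday kanji, 3 bytes each in UTF-8
rare = "\U00020BB7"   # U+20BB7, a kanji variant outside the Basic Multilingual Plane

print(fits_in_mysql_utf8(common))  # True
print(fits_in_mysql_utf8(rare))    # False: this character needs 4 bytes
```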

Community
Rick James