This might be a dumb question, but I'm a little lost.

How exactly are Arabic characters stored in a database?

Let's take ب. If I insert that directly into the DB, it becomes ?. Not good.

If I use a form (and a PHP script) and store it as UTF-8, it is stored as &#1576;. I can read it back and print it out, all good.

So my question is: are Arabic (and Japanese, ...) letters always stored like this (&#1576;) in a MySQL database? Or should I change a setting somewhere so that it looks like ب when I'm browsing the database?

I'm asking so that I can define the length of my columns (varchars/chars) in the database...

DB set to utf8_general

Site fully UTF8

Nicolas.

1 Answer

If you try to store a UTF-8 encoded character and it becomes ?, this means MySQL did not understand or support the encoding in which you sent the character. The column needs to be set to store utf8 data (better utf8mb4 if supported) and the connection encoding needs to be set to the correct encoding to inform MySQL in what encoding you're sending data to it.
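To make the ? case concrete, here is a minimal Python sketch. It is only an illustration: Python's codecs stand in for a connection or column charset that cannot represent the character, and latin-1 is an assumed example of such a charset, not necessarily the one in play here.

```python
# Illustration only: Python codecs stand in for a charset that lacks the character.
text = "ب"  # ARABIC LETTER BEH, U+0628

# Encoded as UTF-8, the letter is one character but two bytes.
utf8_bytes = text.encode("utf-8")
print(utf8_bytes)                  # b'\xd8\xa8'
print(len(text), len(utf8_bytes))  # 1 2

# latin-1 has no code for this letter, so a lossy conversion substitutes "?".
lossy = text.encode("latin-1", errors="replace")
print(lossy)                       # b'?'

# With UTF-8 at every step, the round trip is lossless.
round_trip = utf8_bytes.decode("utf-8")
print(round_trip == text)          # True
```

Once the ? has been written, the original character cannot be recovered from it. As for column sizes, MySQL counts VARCHAR(n) in characters rather than bytes, so a correctly stored ب takes up one character of the declared length.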

If you get HTML entities from a form submission, this means the browser tried to send data in an encoding which did not support that particular character; therefore it had to fall back on HTML entities to encode the character. You need to set the encoding declarations correctly to tell the browser it should send UTF-8 encoded text to the server.
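The entity fallback can be sketched the same way in Python. This only mimics the behaviour described above, and cp1252 is an assumed example of a form encoding that lacks Arabic letters.

```python
import html

text = "ب"  # ARABIC LETTER BEH, U+0628 (decimal 1576)

# Mimic a form submitted in an encoding without Arabic letters:
# unrepresentable characters become numeric character references.
as_entity = text.encode("cp1252", errors="xmlcharrefreplace")
print(as_entity)       # b'&#1576;'
print(len(as_entity))  # 7

# The entity can be decoded back, but it is markup, not the character itself.
recovered = html.unescape(as_entity.decode("ascii"))
print(recovered == text)  # True
```

Storing the entity means the database holds seven ASCII characters instead of one letter, which also throws off any column length planned in characters.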

See "Handling Unicode Front To Back In A Web App" and/or "UTF-8 all the way through" for how to do all this.

deceze
  • It's working now! I forgot to set the field to UTF-8 as well. And my Ajax script isn't sending UTF-8, for a reason I still don't know: http://stackoverflow.com/questions/16962354/ajax-post-method-with-utf-8 – Nicolas. Jun 06 '13 at 15:31