  • I have a bunch of MySQL tables whose varchar fields are utf8_general_ci
  • The default_charset for PHP is utf8
  • Lastly, I set the charset to UTF-8 in my HTML head with <meta charset="UTF-8">

The text in this database is written in several different languages from English to Chinese.

A lot of the 'e's have accents over them and they use funky quotes.

With no fluff, these special characters are all output to the screen as a question mark in a black diamond (except 2).

By simply wrapping the variables in utf8_encode($string) all is well... Again except for those 2.

First question: Why do I need to use utf8_encode if everything is already set that way?

Second question: Several entries use ♀ and ♂ (they're stored in the database just like that). These show up as simple question marks (no black diamond). It does not matter what I do to them, they will not change. I've tried every possible combination of utf8_encode, utf8_decode, htmlspecialchars, and htmlspecialchars_decode. Nothing. The ONLY solution is to change the database entry to use &#9792 for ♀ for example, then without any fluff it is output right. Why?

mister martin
  • You'll need to use `utf8_encode` if your data isn't UTF-8, and if it isn't it means something slipped in your chain from HTML -> Browser -> Server -> PHP -> Database Connection -> Database. – tadman Nov 22 '16 at 21:23
  • Could you elaborate? What could I have missed? This also doesn't explain why the two male and female characters don't work – Antonio Anonymous Nov 22 '16 at 21:24
  • You need to ensure each step in the chain is UTF8 or something will be broken. I'm not sure where your problem is, but you'll have to carefully test each phase. – tadman Nov 22 '16 at 21:40

1 Answer


For Emoji, and for the less common Chinese characters that fall outside the Basic Multilingual Plane, you need utf8mb4 instead of utf8.

For black diamonds, search for that term in this Q&A. It says:

  • The connection (or SET NAMES) for the SELECT was not utf8mb4. Fix this.
  • Also, check that the column in the database is CHARACTER SET utf8mb4.
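To illustrate why a non-UTF-8 connection would produce exactly the symptoms in the question, here is a rough simulation. It is written in Python rather than PHP so it runs without a database, and it assumes the connection charset is latin1, which the question does not state. The function names are made up for the sketch, and ISO-8859-1 stands in for MySQL's latin1, which is closer to cp1252.

```python
# Simulation only. Assumption: the MySQL connection charset is latin1
# (not confirmed by the question). All function names are hypothetical.

REPLACEMENT = "\ufffd"  # the "question mark in a black diamond"


def server_to_latin1(text: str) -> bytes:
    """Mimic the server converting stored text for a latin1 connection:
    characters with no latin1 equivalent are replaced by a plain '?'."""
    return text.encode("latin-1", errors="replace")


def browser_decode(raw: bytes) -> str:
    """Mimic a browser decoding the page as UTF-8, as <meta charset="UTF-8"> requests."""
    return raw.decode("utf-8", errors="replace")


def utf8_encode(raw: bytes) -> bytes:
    """Mimic PHP's utf8_encode(): treat the bytes as ISO-8859-1 and re-encode as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


stored = "café ♀"                       # what the table actually holds
wire = server_to_latin1(stored)         # ♀ is already a plain '?' at this point
raw_output = browser_decode(wire)       # é shows up as the black diamond
encoded_output = browser_decode(utf8_encode(wire))  # é is repaired, ♀ is still '?'
```

Under that assumption, the accented letters survive as single latin1 bytes that utf8_encode() can repair, whereas ♀ and ♂ have already been turned into a literal `?` before PHP sees them, so no function applied afterwards can bring them back. Setting the connection charset removes the need for utf8_encode() altogether.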

Since you seem to be using PHP but did not say whether you are using mysqli or PDO, I will refer you to the PHP checklist.

If you have &#9792;, then somewhere you converted the symbols into "html entities", perhaps with PHP's htmlentities()? Don't use any conversion tools in the client; they only make things worse.
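As for why the entity form displays correctly where the raw symbol does not: the entity consists of ASCII characters only, so it passes through any charset conversion unchanged and the browser turns it back into the symbol. A small Python sketch of that, again assuming a latin1 connection (not stated in the question):

```python
import html

# Assumption: the connection converts text to latin1, as in a misconfigured setup.
entity = "&#9792;"                       # numeric character reference for U+2640
wire = entity.encode("latin-1")          # ASCII only, so nothing is replaced by '?'
rendered = html.unescape(wire.decode("utf-8"))  # the browser resolves the reference
```

That it works is a side effect of the entity being ASCII, not a sign that the charset chain is correct.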

Community
Rick James