I have a field in a MySQL database (utf8_general_ci) that contains a curly (smart?) apostrophe: Owner’s...

This prints fine with no special handling if I access the PHP page that pulls it from the DB. However, I am trying to access it via a $.getJSON request on another page, so I used PHP's json_encode. It truncates the value so that it reads Owner, then successfully encodes the rest of the data. If I use PHP's utf8_encode on the field before I json_encode, it includes the full value with the ’ encoded to \u0092, which then doesn't print anything on the page, giving me Owners. PHP's htmlentities and htmlspecialchars have no effect.

Looking at the request in Chrome's tools, Owner’s is shown as Owner�s on the $.getJSON page.

Can anyone help me out here? I have read other questions on SO and the web but I cannot find anything that helps and I haven't worked much with JSON.

Thanks for reading.

Sarah Kemp
  • `If I use PHP's utf8_encode on the field before I json_encode, it includes the full value with the ’ encoded to \u0092` that sounds like the way to go, that looks right. Can you elaborate on how that fails exactly? What does this scenario look like in Chrome's tools? – Pekka Feb 26 '13 at 19:25
  • I think you need to set the encoding for the page you are outputting too. Also, you might need to decode the UTF-8. – crush Feb 26 '13 at 19:40
  • @Pekka: On the page, `Owner\0092s` is rendered as `Owners`, in Chrome's Tools, it looks like `Owner\0092s`. @crush, I am concerned this is not converting correctly because a search for \0092 indicates it is a control character (?) not an apostrophe? http://stackoverflow.com/questions/11030851/jquery-parsejson-u0092-character-is-not-parsed – Sarah Kemp Feb 26 '13 at 19:56
  • Check out [UTF-8 all the way through](http://stackoverflow.com/q/279170); you may have a connection problem (ISO-8859-1 charset). – Pekka Feb 26 '13 at 19:58

2 Answers

For details: json_encode

Example:

echo json_encode($array, JSON_HEX_TAG | JSON_HEX_APOS | JSON_HEX_QUOT | JSON_HEX_AMP | JSON_UNESCAPED_UNICODE);
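As a rough illustration of what the last flag changes (a sketch, not part of the original answer): the array contents below are made up, and the `\u{2019}` string escape assumes PHP 7 or later. It also shows that the flags only affect how valid UTF-8 is written out; a stray Windows-1252 byte such as 0x92 still makes `json_encode` fail (it returns `false` on PHP 5.5 and later).

```php
<?php
// Hypothetical sample data: U+2019 is the curly apostrophe, as valid UTF-8.
$data = ['title' => "Owner\u{2019}s"];

// Default behaviour: non-ASCII characters are escaped as \uXXXX.
$escaped = json_encode($data);

// With JSON_UNESCAPED_UNICODE the character is written as raw UTF-8 bytes.
$raw = json_encode($data, JSON_UNESCAPED_UNICODE);

echo $escaped, "\n"; // {"title":"Owner\u2019s"}
echo $raw, "\n";     // {"title":"Owner’s"}

// A lone 0x92 byte is not valid UTF-8, so encoding fails regardless of
// the flags used above.
$bad = json_encode(['title' => "Owner\x92s"], JSON_UNESCAPED_UNICODE);
var_dump($bad);                     // bool(false)
echo json_last_error_msg(), "\n";   // describes the UTF-8 error
```

Both of the first two outputs decode to the same string on the JavaScript side, so the flag is mostly a matter of payload size and readability.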
itsazzad
  • @delive You may try with the necessary options `JSON_HEX_QUOT, JSON_HEX_TAG, JSON_HEX_AMP, JSON_HEX_APOS, JSON_NUMERIC_CHECK, JSON_PRETTY_PRINT, JSON_UNESCAPED_SLASHES, JSON_FORCE_OBJECT, JSON_PRESERVE_ZERO_FRACTION, JSON_UNESCAPED_UNICODE, JSON_PARTIAL_OUTPUT_ON_ERROR ` : http://php.net/manual/en/function.json-encode.php – itsazzad Mar 22 '16 at 18:38
  • Thanks. I didn't know about `JSON_UNESCAPED_UNICODE`. – Ryan May 14 '19 at 15:51

Using PHP's utf8_encode() before my json_encode() did indeed stop the data from being cut off at the ’, but it also encoded it to \u0092, which did not display (it is a control character). When I ran MySQL's SET NAMES utf8 before my query, I did not have to use utf8_encode() at all, and my JSON was encoded correctly, with ’ mapping to \u2019, which displays nicely.
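For anyone wondering where \u0092 comes from, here is a minimal sketch of what I believe was happening. It assumes the connection was handing PHP the apostrophe as the single Windows-1252 byte 0x92 (the `$fromDb` value is hard-coded to simulate that, and `iconv` is used here only for illustration). utf8_encode() treats its input as ISO-8859-1, where 0x92 is a control character, whereas in Windows-1252 the same byte is the curly apostrophe.

```php
<?php
// Assumption: this is the raw byte string received over a non-UTF-8
// connection; 0x92 is the curly apostrophe in Windows-1252.
$fromDb = "Owner\x92s";

// Interpreting the byte as ISO-8859-1 (which is what utf8_encode() does)
// yields the control character U+0092.
$asLatin1 = iconv('ISO-8859-1', 'UTF-8', $fromDb);

// Interpreting it as Windows-1252 yields U+2019, the intended apostrophe.
$asCp1252 = iconv('CP1252', 'UTF-8', $fromDb);

echo json_encode($asLatin1), "\n"; // "Owner\u0092s"
echo json_encode($asCp1252), "\n"; // "Owner\u2019s"
```

With SET NAMES utf8 the server sends UTF-8 in the first place, so no conversion step like this is needed.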

Thanks for the link @Pekka, it helped me narrow down the possibilities.

Sarah Kemp