4

I have a JSON file with Unicode characters, and I'm having trouble parsing it. I've tried the JSON library in Flash CS5 and the online parser at http://json.parser.online.fr/, and I always get "unexpected token - eval fails".

I'm sorry, there really was a problem with the syntax; it came this way from the client.

Can someone please help me? Thanks

André Alçada Padez
  • Are you sure that it's a Unicode issue, and not a problem with the JSON data itself (syntax error, incomplete file, ...)? – Jem May 16 '11 at 15:15

6 Answers

6

Quoth the RFC:

JSON text SHALL be encoded in Unicode. The default encoding is UTF-8.

So a correctly encoded Unicode character should not be a problem, which leads me to believe that the file is not correctly encoded (maybe it uses Latin-1 instead of UTF-8). How did you create the file? In a text editor?
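One way to test that suspicion is to decode the raw bytes strictly as UTF-8 and see where decoding fails. The following is a minimal Python sketch, not something from the question; the helper name `check_utf8` and the sample data are made up for illustration.

```python
def check_utf8(raw: bytes):
    """Return None if raw is valid UTF-8, else (offset, offending byte value)."""
    try:
        raw.decode("utf-8")  # strict decoding raises on any invalid sequence
    except UnicodeDecodeError as err:
        return err.start, raw[err.start]
    return None


# Hypothetical sample: the same JSON text saved in two different encodings.
good = '{"name": "André"}'.encode("utf-8")
bad = '{"name": "André"}'.encode("latin-1")

print(check_utf8(good))  # None: the bytes are valid UTF-8
print(check_utf8(bad))   # (14, 233): byte 0xE9 at offset 14 is not valid UTF-8
```

If the check reports an offset, the file was probably saved in a legacy single-byte encoding and needs to be re-saved as UTF-8 before parsing.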

Mike Baranczak
2

There might be an obscure Unicode whitespace character hidden in your string.

This URL contains more detail:

http://timelessrepo.com/json-isnt-a-javascript-subset
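The linked article is about U+2028 (LINE SEPARATOR) and U+2029 (PARAGRAPH SEPARATOR): JSON allows them unescaped inside strings, but JavaScript engines before ES2019 treated them as line terminators, so an eval-based parser could choke on them. A rough Python sketch for locating and escaping them follows; the function names and sample text are assumptions for illustration, and the replacement assumes the characters only occur inside JSON string values.

```python
import json

# U+2028 LINE SEPARATOR and U+2029 PARAGRAPH SEPARATOR, mapped to their JSON escapes.
JS_LINE_TERMINATORS = {"\u2028": "\\u2028", "\u2029": "\\u2029"}


def find_js_line_terminators(text):
    """Return (index, code point) pairs for every U+2028/U+2029 in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch in JS_LINE_TERMINATORS
    ]


def escape_js_line_terminators(text):
    """Replace the raw characters with \\uXXXX escapes, which eval can handle."""
    for ch, escaped in JS_LINE_TERMINATORS.items():
        text = text.replace(ch, escaped)
    return text


# Hypothetical JSON text with a raw U+2028 hidden inside a string value.
sample = '{"note": "line one\u2028line two"}'

print(find_js_line_terminators(sample))  # [(18, 'U+2028')]
safe = escape_js_line_terminators(sample)
print(json.loads(safe) == json.loads(sample))  # True: same data, eval-safe text
```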

knb
2

In ASP.NET you would think you could use System.Text.Encoding to convert a string like "Paul\u0027s" back to a string like "Paul's", but I tried for hours and found nothing that worked.

The trouble is that hardcoding a string as shown above already decodes it, as you will see if you put a breakpoint on it. In the end I wrote a function that converts the hex value (27) to its decimal equivalent (39), which gave me HTML-encoded entities, and then decoded those.

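    // Requires a reference to System.Web (for HttpUtility).
    // Replace each literal \u0001..\u00ff escape with its decimal HTML entity,
    // then let HtmlDecode turn the entities into real characters.
    // Formatting the counter as hex also covers escapes containing a-f.
    for (int f = 1; f <= 0xFF; f++)
    {
        string Dec = "&#" + f + ";";
        HTML = HTML.Replace("\\u" + f.ToString("x4"), Dec); // lowercase, e.g. \u00fc
        HTML = HTML.Replace("\\u" + f.ToString("X4"), Dec); // uppercase, e.g. \u00FC
    }
    HTML = System.Web.HttpUtility.HtmlDecode(HTML);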

Ugly as sin, I know, but without the latest framework (not available on my ISP's server) it was the best I could do. Someone must know a better solution.

Flexo
Flash
0

I had the same problem. I just changed the file's encoding from Mac-Roman/Windows-1252 to UTF-8, and it worked.
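If re-saving the file in an editor is not convenient, the same conversion can be scripted. This is a small Python sketch under the assumption that the source encoding is known; the helper `reencode_to_utf8`, the file names, and the sample content are hypothetical (Python's codec names for the two encodings mentioned are `cp1252` and `mac_roman`).

```python
import json
import tempfile
from pathlib import Path


def reencode_to_utf8(src, dst, source_encoding="cp1252"):
    """Read src using the given legacy encoding and write it back out as UTF-8."""
    text = Path(src).read_text(encoding=source_encoding)
    Path(dst).write_text(text, encoding="utf-8")


# Demonstration with a throwaway file standing in for the client's data.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "client.json"
    dst = Path(tmp) / "client.utf8.json"
    src.write_bytes('{"city": "São Paulo"}'.encode("cp1252"))

    reencode_to_utf8(src, dst)
    data = json.loads(dst.read_text(encoding="utf-8"))

print(data["city"])  # São Paulo
```

Picking the wrong source encoding will not raise an error for most single-byte encodings; it silently yields the wrong characters, so it is worth spot-checking the output.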

Chamira Fernando
0

I had the same problem with Twitter JSON files. I was parsing them in Python with json.loads(tweet), but it failed for half of the records.

I switched to Python 3 and it works well now.

Ash
0

If you seem to have trouble with the encoding of a JSON file generated by Python with json.dumps() (i.e. escaped codes such as \u00fc aren't displayed as characters regardless of your editor's encoding setting): it produces ASCII output by default and escapes the non-ASCII characters! See the questions "python json unicode - how do I eval using javascript", "python: json.dumps can't handle utf-8?" and "Why does json.dumps escape non-ascii characters with \uxxxx".
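A short Python 3 illustration of that default and of the `ensure_ascii=False` option, which keeps the characters as-is; the sample data is made up.

```python
import json

data = {"greeting": "Grüße"}  # hypothetical sample data

# Default: non-ASCII characters are written as \uXXXX escapes.
escaped = json.dumps(data)
print(escaped)   # {"greeting": "Gr\u00fc\u00dfe"}

# ensure_ascii=False keeps the characters themselves in the output.
literal = json.dumps(data, ensure_ascii=False)
print(literal)   # {"greeting": "Grüße"}

# Both forms are valid JSON and decode to the same data.
print(json.loads(escaped) == json.loads(literal) == data)  # True
```

When writing the `ensure_ascii=False` output to disk, open the file with an explicit `encoding="utf-8"` so the characters are stored as UTF-8 rather than in the platform default encoding.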

handle