I have the following JSON:

{"StationRow":[{"Title":"XXX"},X{"Thumbnail":"http://exampletv.com/shopping/Portals/10/PropertyAgent/757/Images/6.jpg"},{"LinkCode":"http://www.youtube.com/watch?v=J4bw4y3h69I http://www.youtube-nocookie.com/embed/J4bw4y3h69I?rel=0"},{"SourceType":"embed"},{"LinkURL":"http://www.youtube.com/watch?v=J4bw4y3h69I"},{"Title":"ΚΛΕΜΜΕΝΑ ΟΝΕΙΡA"},{"Description":"XXXX."},{"Thumbnail":"http://exampletv.com/shopping/Portals/10/PropertyAgent/757/Images/14.jpg"},{"LinkCode":"ΚΛΕΜΜΕΝΑ ΟΝΕΙΡΑ - ΕΠ. 293 ΑΠΟΣΠΑΣΜΑ, http://www.youtube.com/watch?v=wSrhamIIaR4, http://exampletv.com/shopping/Portals/10/PropertyAgent/757/Images/17.jpg, ΚΛΕΜΜΕΝΑ ΟΝΕΙΡΑ - ΕΠ. 292 ΑΠΟΣΠΑΣΜΑ, http://www.youtube.com/watch?v=jxxhttp://exampletv.com/shopping/Portals/10/PropertyAgent/757/Images/16.jpg, ΚΛΕΜΜΕΝΑ ΟΝΕΙΡΑ - ΕΠ. 291 ΑΠΟΣΠΑΣΜΑ, http://www.youtube.com/watch?v=xx, http://exampletv.com/shopping/Portals/10/PropertyAgent/757/Images/15.jpg, xx ΟΝΕΙΡΑ - ΕΠ. 290 ΑΠΟΣΠΑΣΜΑ, http://www.youtube.com/watch?v=ILcwh7tMJ2Y, http://exampletv.com/shopping/Portals/10/PropertyAgent/757/Images/14.jpg, "},{"SourceType":"embed"},{"LinkURL":""}]}

Parsing it with simplejson throws the following exception:

NOTE: IGNORING THIS CAN LEAD TO MEMORY LEAKS!
Error Type: <class 'simplejson.scanner.JSONDecodeError'>
Error Contents: Invalid control character 'h' at: line 1 column 260 (char 259)
Traceback (most recent call last):

JSONLint reports it as valid JSON.

How can I figure out what is wrong?

I am using the Python 2.6 that ships with XBMC.

Volatil3
  • This JSON (copied and pasted) parses fine using the json library in Python 2.7 and 3.3 (which is basically simplejson). – Thayne Feb 01 '14 at 08:09
  • So is this a version issue? – Volatil3 Feb 01 '14 at 08:11
  • Looking at the error statement, it looks like you may have a non-printing control character in your input. It isn't in the JSON you posted, but may be in your original text. The first thing I would do is open the JSON in a text editor that can show you all control characters, and see if there is one, either delete it or escape it. – Thayne Feb 01 '14 at 08:14
  • btw that "control character 'h'" probably refers to the ASCII Backspace control character. – Thayne Feb 01 '14 at 08:15
  • JSONLint validates it: http://jsonlint.com/ As for debugging, which editor should I open it in? I am on a Mac. – Volatil3 Feb 01 '14 at 08:16
  • vim will work, if you know how to use it. – Thayne Feb 01 '14 at 08:23
  • Ironically `:set list` does not show any such thing – Volatil3 Feb 01 '14 at 08:28
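
When an editor does not reveal anything, the string can be scanned programmatically. The following is a minimal sketch, not something from the thread: `find_control_chars` is a hypothetical helper and `sample` is made-up input. It lists the offset of every ASCII control character (0x00-0x1F). Newlines and tabs between JSON tokens are legal, so only hits inside string values matter.

```python
import re

# ASCII control characters (0x00-0x1F); JSON forbids them unescaped
# inside string values.
CONTROL_CHAR_RE = re.compile(r'[\x00-\x1f]')

def find_control_chars(text):
    """Return (offset, repr) pairs for each control character in text."""
    return [(m.start(), repr(m.group()))
            for m in CONTROL_CHAR_RE.finditer(text)]

# Made-up input containing a raw backspace (0x08) inside a string value.
sample = u'{"LinkCode": "watch?v=abc\x08http://example"}'
hits = find_control_chars(sample)
```

Each entry in `hits` pairs an offset with the escaped form of the character, which can then be compared with the position in the error message.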

2 Answers

Save your JSON as text.json and run this in your terminal:

cat text.json | od -c

It will produce something like this:

0000000    {   "   S   t   a   t   i   o   n   R   o   w   "   :   [   {
0000020    "   T   i   t   l   e   "   :   "   Σ  **   Υ  **   Ν  **   Τ
0000040   **   Α  **   Γ  **   Ε  **   Σ  **       Ε  **   Λ  **   Λ  **
0000060    Η  **   Ν  **   Ι  **   Κ  **   Ε  **   Σ  **   "   }   ,   {
0000100    "   D   e   s   c   r   i   p   t   i   o   n   "   :   "   Σ
0000120   **   Υ  **   Ν  **   Τ  **   Α  **   Γ  **   Ε  **   Σ  **    

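Then go to position 259 (the offset reported in the error) and see what is going on. Note that od prints offsets in octal by default and counts bytes rather than characters; `od -A d -c` prints decimal offsets instead.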
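
The same inspection can be done from Python by slicing around the reported offset. This is only a sketch: `show_context` is a hypothetical helper and `sample` is made-up input. The offset is counted in units of whatever string type was handed to the parser, so the slice should be taken from that same object.

```python
def show_context(text, pos, width=20):
    """Return repr() of the text surrounding offset pos."""
    start = max(0, pos - width)
    # repr() makes invisible characters show up as escapes such as \x08
    return repr(text[start:pos + width])

# Made-up input with a raw backspace at offset 3.
sample = u'abc\x08def'
context = show_context(sample, 3, width=2)
```

With the real data, the call would be `show_context(original_json, 259)`, where `original_json` stands for the string passed to `simplejson.loads`.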

Burhan Khalid
Auro
  • Amazing. Does every column correspond to the column number given in the error? Each line seems to have fewer than 20 columns. – Volatil3 Feb 08 '14 at 10:53
  • I think od lays the columns out according to its own formatting, so they do not match the error's column number. It is probably better to search for the strings just before or after the character position in the original file. – Auro Feb 08 '14 at 18:40
  • I did it by just doing `mystring[positionIndex]` – Volatil3 Feb 08 '14 at 19:01
You might want to try another approach: remove all non-printable characters from the string before parsing it. Modified from "Stripping non printable characters from a string in python":

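import re
import simplejson

def remove_control_chars(s):
    # Python 2 code. s should be a unicode object: on a UTF-8 byte string
    # the 127-159 range could also strip bytes of multi-byte characters.
    control_chars = ''.join(map(unichr, range(0, 32) + range(127, 160)))
    control_char_re = re.compile('[%s]' % re.escape(control_chars))

    return control_char_re.sub('', s)

# original_json holds the raw JSON text that failed to parse
cleaned_json = remove_control_chars(original_json)
obj = simplejson.loads(cleaned_json)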
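
If the control characters should be kept rather than stripped, a different option is the `strict` flag of `loads`, which allows control characters inside strings when set to `False`. The sketch below uses the standard `json` module so it runs without extra packages; `simplejson.loads` accepts the same keyword. The input `raw` is made up for illustration.

```python
import json  # simplejson.loads accepts the same strict keyword

# Made-up input with a raw backspace (0x08) inside a string value.
raw = u'{"Title": "bad\x08value"}'

try:
    json.loads(raw)        # default strict parsing rejects the character
    strict_ok = True
except ValueError:         # decode errors are ValueError or a subclass
    strict_ok = False

# strict=False lets the control character through, unchanged, in the value.
lenient = json.loads(raw, strict=False)
```

This keeps the data as it arrived, so any cleanup of the stray character is left to later code.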
Community
Burhan Khalid