96

I am printing my string with the following code:

jsonString = data.decode("utf-8")

print jsonString

And this is the string that was printed on the console:

{"description":"Script to check testtbeat of TEST 1 server.", "script":"#!/bin/bash\nset -e\n\nCOUNT=60   #number of 10 second timeouts in 10 minutes\nSUM_SYNCS=0\nSUM_SYNCS_BEHIND=0\nHOSTNAME=$hostname      \n\nwhile [[ $COUNT -ge \"0\" ]]; do\n\necho $HOSTNAME\n\n#send the request, put response in variable\nDATA=$(wget -O - -q -t 1 http://$HOSTNAME:8080/heartbeat)\n\n#grep $DATA for syncs and syncs_behind\nSYNCS=$(echo $DATA | grep -oE 'num_syncs: [0-9]+' | awk '{print $2}')\nSYNCS_BEHIND=$(echo $DATA | grep -oE 'num_syncs_behind: [0-9]+' | awk '{print $2}')\n\necho $SYNCS\necho $SYNCS_BEHIND\n\n#verify conditionals\nif [[ $SYNCS -gt \"8\" && $SYNCS_BEHIND -eq \"0\" ]]; then exit 0; fi\n\n#decrement the counter\nlet COUNT-=1\n\n#wait another 10 seconds\nsleep 10\n\ndone\n"}

But when I load it with Python's json.loads, as shown below:

jStr = json.loads(jsonString)

I get this error:

ERROR Invalid control character at: line 1 column 202 (char 202)

I looked at char 202, but I have no idea why it is causing an issue. In Notepad++, char 202 appears to be an e, though I may be counting it wrong.

Any idea what is wrong? How do I find out which character is causing the problem?

UPDATE:

jsonString = {"description":"Script to check testtbeat of TIER 1 server.", "script":"#!/bin/bash\nset -e\n\nCOUNT=60   #number of 10 second timeouts in 10 minutes\nSUM_SYNCS=0\nSUM_SYNCS_BEHIND=0\nHOSTNAME=$hostname      \n\nwhile [[ $COUNT -ge \"0\" ]]; do\n\necho $HOSTNAME\n\n#send the request, put response in variable\nDATA=$(wget -O - -q -t 1 http://$HOSTNAME:8080/heartbeat)\n\n#grep $DATA for syncs and syncs_behind\nSYNCS=$(echo $DATA | grep -oE 'num_syncs: [0-9]+' | awk '{print $2}')\nSYNCS_BEHIND=$(echo $DATA | grep -oE 'num_syncs_behind: [0-9]+' | awk '{print $2}')\n\necho $SYNCS\necho $SYNCS_BEHIND\n\n#verify conditionals\nif [[ $SYNCS -gt \"8\" && $SYNCS_BEHIND -eq \"0\" ]]; then exit 0; fi\n\n#decrement the counter\nlet COUNT-=1\n\n#wait another 10 seconds\nsleep 10\n\ndone\n"}

print jsonString[202]

I got the following error:

KeyError: 202
arsenal

4 Answers

242

Control characters can be allowed inside a string as follows:

json_str = json.loads(jsonString, strict=False)

You can find this in the docs for Python 2 or the docs for Python 3:

If strict is false (True is the default), then control characters will be allowed inside strings. Control characters in this context are those with character codes in the 0–31 range, including '\t' (tab), '\n', '\r' and '\0'.
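
As a minimal sketch of the difference between the two modes (the names raw_text, strict_ok and lenient are made up for this illustration, and the sample string is not the one from the question):

```python
import json

# A JSON document whose string value contains a real newline character
# (code point 10). Because this is a regular Python literal, \n becomes
# an actual newline before json.loads ever sees it.
raw_text = '{"script": "line one\nline two"}'

try:
    json.loads(raw_text)  # strict=True is the default
    strict_ok = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    strict_ok = False

# With strict=False the control character is accepted inside the string.
lenient = json.loads(raw_text, strict=False)

print(strict_ok)
print(repr(lenient["script"]))
```

Note that strict=False only relaxes the rule about control characters inside strings; other syntax problems in the input should still raise an error.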

Simeon Leyzerzon
Joe Cheng
  • This worked for me as I have no influence on how the json string was formatted. – pansen Apr 04 '17 at 07:28
  • Thanks @NikhilParmar You are a life saver :) I was struggling with this problem from last 3 days and finally this one worked for me – Afraz Ahmad Jul 10 '17 at 07:57
  • I used this solution as well, but then found that later, this caused a different JSONDecodeError: Unterminated string. Ideas how to deal with this? – sFishman May 18 '20 at 07:56
89

There is no error in your json text.

You can get the error if you copy-paste the string into your Python source code as a string literal. In that case, \n is interpreted as a single character (newline). You can fix it by using a raw string literal instead (r''). Use triple quotes (r'''...''') to avoid escaping the " and ' quotes inside the string literal.
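
A short sketch of that effect, using a shortened stand-in for the script from the question (the variable names pasted, pasted_raw, regular_parses and parsed are hypothetical):

```python
import json

# Regular literal: Python converts \n into a real newline character,
# so the JSON parser sees a raw control character inside a string.
pasted = '{"script": "#!/bin/bash\nset -e\n"}'

# Raw literal: the backslash and the letter n stay as two characters,
# which is the valid JSON escape sequence for a newline.
pasted_raw = r'{"script": "#!/bin/bash\nset -e\n"}'

try:
    json.loads(pasted)
    regular_parses = True
except ValueError:
    regular_parses = False

parsed = json.loads(pasted_raw)

print(regular_parses)
print(repr(parsed["script"]))
```

The raw literal is two characters longer here, one extra character for each \n that Python left unconverted.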

jfs
  • This answer is no different than mine. Prefixing your string with 'r' (making it a string literal) escapes your newlines. The problem lies with the OP's code. If OP truly did `print jsonString` followed by `json.loads(jsonString)`, he wouldn't have encountered this error. Otherwise, his console output would not have shown a literal \n but a newline. – Pakman Mar 27 '14 at 17:10
  • @Pakman: incorrect. My answer: *"There is no error in your json text."*. Your answer: *"Escape your newlines."* – jfs Mar 27 '14 at 17:11
  • Your statement "There is no error in your json text" is correct, but there _was_ an error in the text he was inputting into json.loads, which we both address – Pakman Mar 27 '14 at 17:14
  • yes. If you take a valid json string with `\n` in it and paste it into Python source code as a string literal then the string won't be a valid json no more – jfs Mar 27 '14 at 17:18
  • In case you don't have control over the JSON string, take a look at [Joe Cheng's answer below](https://stackoverflow.com/questions/22394235/invalid-control-character-with-python-json-loads#answer-29827074). – Renato Byrro Aug 10 '17 at 15:14
  • @RenatoByrro if you can't control a *literal* string in *your own* Python source code, you have bigger problems – jfs Sep 01 '19 at 18:45
  • @RenatoByrro if the string is not hardcoded then there is no issue with the json (read the very first sentence in the answer) – jfs Sep 02 '19 at 06:12
  • @RenatoByrro it would be a *different* question then. This answer addresses a case when a valid json text becomes invalid if copy-pasted as a Python literal string. – jfs Sep 02 '19 at 12:54
  • @RenatoByrro some people read past the title of the question but the question itself too. For those people, your comment may be misleading. – jfs Sep 06 '19 at 18:37
15

Try passing strict=False to json.loads; it will then accept "\n" and other control characters inside strings, as in the following:

import json
  
test_string = ' { "key1" : "1015391654687" , "key2": "value2 \n " } '

res = json.loads(test_string, strict=False)
  
print(res)

Output:

{'key1': '1015391654687', 'key2': 'value2 \n '}
K.A
-3

Escape your newlines.

{"description":"Script to check testtbeat of TEST 1 server.", "script":"#!/bin/bash\\nset -e\\n\\nCOUNT=60   #number of 10 second timeouts in 10 minutes\\nSUM_SYNCS=0\\nSUM_SYNCS_BEHIND=0\\nHOSTNAME=$hostname      #dc1dbx1145.dc1.host.com\\n\\nwhile [[ $COUNT -ge \\"0\\" ]]; do\\n\\necho $HOSTNAME\\n\\n#send the request, put response in variable\\nDATA=$(wget -O - -q -t 1 http://$HOSTNAME:8080/heartbeat)\\n\\n#grep $DATA for syncs and syncs_behind\\nSYNCS=$(echo $DATA | grep -oE 'num_syncs: [0-9]+' | awk '{print $2}')\\nSYNCS_BEHIND=$(echo $DATA | grep -oE 'num_syncs_behind: [0-9]+' | awk '{print $2}')\\n\\necho $SYNCS\\necho $SYNCS_BEHIND\\n\\n#verify conditionals\\nif [[ $SYNCS -gt \\"8\\" && $SYNCS_BEHIND -eq \\"0\\" ]]; then exit 0; fi\\n\\n#decrement the counter\\nlet COUNT-=1\\n\\n#wait another 10 seconds\\nsleep 10\\n\\ndone\\n"}

Works for me.

Also, if you get an error like this in the future, a debugging technique you can use is to shorten the string to something that works and slowly add data until it doesn't.
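
As an alternative to trimming the string by hand, the exception itself can point at the problem. This is only a sketch and assumes Python 3.5 or later, where json.JSONDecodeError carries a pos attribute; the helper name find_bad_char and its width parameter are invented for this example:

```python
import json

def find_bad_char(text, width=10):
    """Return (position, repr of nearby text) where parsing fails, or None."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:  # Python 3.5+ only
        start = max(0, err.pos - width)
        # repr() makes invisible characters such as tabs and newlines visible.
        return err.pos, repr(text[start:err.pos + width])
    return None

# Regular literal, so \t is a real tab character inside the JSON string.
sample = '{"key": "a\tb"}'
result = find_bad_char(sample)
print(result)
```

Printing the repr of the surrounding text shows escapes like \t or \n explicitly, which is easier than counting characters in an editor.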

Pakman