
I am loading some data into Python from an external JSON file, which for the most part works fine. However, some of the data occasionally contains escaped characters, which I need to retain.

raw input file:

{
    "version": 1,
    "query": "occasionally I \"need\" to escape \"double\" quotes"
} 

loading it into python:

import json

with open('input_file', 'r') as f:
    file = json.load(f)

Edit

Apologies, I should have been clearer. What I am trying to do is something like the following:

'{}'.format(file['query'])

Using json.dumps

actual_query = '"datadog.agent.up".over("role:dns").by("host").last(1).count_by_status()'

json.dumps(actual_query)
'"\\"datadog.agent.up\\".over(\\"role:dns\\").by(\\"host\\").last(1).count_by_status()"'
grinferno
  • your input file seems to give the correct result for me in python 3.6 , what version are you using? – Anand S Kumar Aug 08 '17 at 11:14
  • @AnandSKumar 3.5.2 So when you load it retains the escape chars? – grinferno Aug 08 '17 at 11:16
  • what are you trying to do? if you want to get the json formatting escapes back you can use `json.dumps(file)` – Stael Aug 08 '17 at 11:17
  • I'm not quite sure what you're after... the parsed text correctly doesn't require the backslashes anymore. They'll be correctly escaped if you need to output it again... – Jon Clements Aug 08 '17 at 11:19
  • Check out the answer to this question: https://stackoverflow.com/questions/9295439/python-json-loads-fails-with-valueerror-invalid-control-character-at-line-1-c – Dmytro Chekunov Aug 08 '17 at 11:21
  • @all clarified question – grinferno Aug 08 '17 at 11:49
  • @MaxWeaver :S I worry that you're not understanding something, but I can't tell what because I don't know why you want to do `'{}'.format(file['query'])` - i've updated my answer though and as far as I can see it gives you exactly what you want. – Stael Aug 08 '17 at 12:09
  • @stael I want to generate a string using a value in a json object. – grinferno Aug 08 '17 at 12:13
  • @MaxWeaver you already have a string that is the value of the json object. `json.load(file)` converted the text of the json into a python dictionary where the key `'query'` corresponds to the string object `'occasionally I "need" to escape "double" quotes'` (in python you don't need to escape `"` if your string is bound by `'`). – Stael Aug 08 '17 at 12:16
  • @Stael erm, to be clearer, I want to take certain values from a json object, and write them to file in a specific format (HCL format). for example `'name = {:>2}'.format(file["name"])`. Perhaps I'm going about this all wrong though – grinferno Aug 08 '17 at 14:13
  • @MaxWeaver ok! that makes more sense. I take it you have written the strings to file as they are and it doesn't work? Does my use of `json.dumps` work for you now? The other option might be to look into hcl packages for python. I've never come across them but it looks like they do exist. eg http://www.virtualroadside.com/blog/index.php/2014/10/15/introducing-pyhcl/ – Stael Aug 08 '17 at 14:29
  • @MaxWeaver what I mean by that is that pyhcl should contain its own routines for correctly escaping characters in order to convert valid python objects (such as strings) into valid hcl, which means you don't have to worry about any of this. – Stael Aug 08 '17 at 14:31
  • @Stael if the documentation is anything to go by you can only convert HCL to json. I need to convert json to HCL. Thanks for your help btw – grinferno Aug 08 '17 at 14:43
  • @MaxWeaver yeah, I'm surprised there isn't a `dump` in there. Does my answer work for you? – Stael Aug 08 '17 at 14:49
  • @Stael it almost works. It retains the escape chars but puts the string in double double quotes like so: `"\"\"string\"with\"quotes""` so it's invalid HCL. – grinferno Aug 08 '17 at 15:06
  • @MaxWeaver how are you writing it? it works for me if use `.write(json.dumps(file['query']))` – Stael Aug 08 '17 at 15:12
  • @stael hey, see latest edit :) I realise by trying to use more simplistic data as an example I'm probably confusing people. I am using the actual query I need to write. – grinferno Aug 08 '17 at 15:20
  • @MaxWeaver that looks like you would want it to, isn't it? when you print that, or write it to a file it will come out exactly as you were asking. – Stael Aug 08 '17 at 15:40
  • @Stael it's invalid HCL though. It should look like: `"\\"datadog.agent.up\\".over(\\"role:dns\\").by(\\"host\\").last(1).count_by_status()"`. I'll keep playing. Luckily only a handful of the queries are like this so I can change by hand for the time being. – grinferno Aug 08 '17 at 16:22
  • @MaxWeaver you need double `\\` ? – Stael Aug 08 '17 at 16:26

2 Answers


This is exactly what you should expect, and I'm not sure why it isn't what you want. Remember that print writes out the characters of a string, not its escaped source form, e.g. print('\"') gives ".

using your example, you can see how you would get the escape characters back when outputting your results:

import json

a = r"""{
    "version": 1,
    "query": "occasionally I \"need\" to escape \"double\" quotes"
}"""

j = json.loads(a)


print j

print json.dumps(j)

which gives me:

{u'query': u'occasionally I "need" to escape "double" quotes', u'version': 1}
{"query": "occasionally I \"need\" to escape \"double\" quotes", "version": 1}

(if you'll excuse the python2)
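For anyone on Python 3, the same round trip can be sketched as below. This is only an illustration of the point above; the variable names `raw` and `data` are made up for the example.

```python
import json

# Raw JSON text, as it would appear in the input file.
raw = r'''{
    "version": 1,
    "query": "occasionally I \"need\" to escape \"double\" quotes"
}'''

data = json.loads(raw)

# The parsed value holds plain double quotes; the backslashes were JSON syntax only.
print(data["query"])

# Serialising the value again puts the escapes (and the outer quotes) back.
print(json.dumps(data["query"]))
```

So nothing is lost on load: the escapes are a property of the JSON text, and `json.dumps` regenerates them on output.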


In response to your edit:

'{}'.format(file['query']) == file['query'] returns True - you're formatting a string object as a string. As I have suggested, using

json.dumps(file['query'])

returns

"occasionally I \"need\" to escape \"double\" quotes"

which by the way is the string:

'"occasionally I \\"need\\" to escape \\"double\\" quotes"'
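The difference between those two views is just print versus repr. A small Python 3 sketch (the names `value` and `dumped` are only for illustration) that counts the backslashes actually present in the string:

```python
import json

value = 'occasionally I "need" to escape "double" quotes'
dumped = json.dumps(value)

print(dumped)        # the characters that would be written to a file
print(repr(dumped))  # Python's source-style view, with each backslash doubled

# The dumped string itself contains one backslash per escaped quote.
print(dumped.count("\\"))        # 4
print(repr(dumped).count("\\"))  # 8
```

The doubled backslashes only exist in the repr; what gets written to a file is the first form.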

The same holds for your 'actual query':

query = '"datadog.agent.up".over("role:dns").by("host").last(1).count_by_status()'

which gives

print json.dumps(query)
# "\"datadog.agent.up\".over(\"role:dns\").by(\"host\").last(1).count_by_status()"


with open('myfile.txt', 'w') as f:
    f.write(json.dumps(query))


# file contents:
# "\"datadog.agent.up\".over(\"role:dns\").by(\"host\").last(1).count_by_status()"
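If the goal is to write `name = value` lines, one possible approach is to let `json.dumps` do the quoting for each value. The sketch below is Python 3, and `format_assignment` is a hypothetical helper, not part of any library. It also assumes the target format accepts JSON-style string escapes, which is worth checking against the format's own rules.

```python
import json

def format_assignment(name, value):
    """Return 'name = "value"', with the value quoted and escaped JSON-style."""
    # Assumption: the output format accepts JSON string escaping as-is.
    return "{} = {}".format(name, json.dumps(value))

actual_query = '"datadog.agent.up".over("role:dns").by("host").last(1).count_by_status()'
line = format_assignment("query", actual_query)
print(line)
```

Passing the resulting line to `f.write` puts exactly those characters in the file, with one backslash before each inner quote.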

double \\:

see, this is why you need to be explicit about what you're actually trying to do.

a trick for doubling \ is to put in a repr()

eg:

print repr(json.dumps(query))[1:-1] # to remove the ' from the beginning and end

# "\\"datadog.agent.up\\".over(\\"role:dns\\").by(\\"host\\").last(1).count_by_status()"

with open('myfile.txt', 'w') as f:
    f.write(repr(json.dumps(query))[1:-1])

# file:
# "\\"datadog.agent.up\\".over(\\"role:dns\\").by(\\"host\\").last(1).count_by_status()"

you could also do a .replace('\\', '\\\\') on it (a raw string can't end in a single backslash, so r'\' is a syntax error)
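A Python 3 sketch of the replace approach, compared with the repr() trick. The `tricky` input is made up to show where the two differ: when a string contains both quote styles, repr() also escapes the single quote, so the replace version is probably the safer of the two.

```python
import json

query = '"datadog.agent.up".over("role:dns").by("host").last(1).count_by_status()'

dumped = json.dumps(query)
doubled = dumped.replace("\\", "\\\\")  # every single backslash becomes two
print(doubled)

# For this input the repr() slice produces the same text.
assert doubled == repr(dumped)[1:-1]

# With both quote styles present, repr() adds a backslash before the
# single quote, so the two approaches no longer agree.
tricky = json.dumps('it\'s "quoted"')
print(tricky.replace("\\", "\\\\"))
print(repr(tricky)[1:-1])
```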

Stael

When I run your program, the JSON I get looks a little different. You have single quotes around the second line of your output; I don't get that.

Anyway, while the single quotes sidestep the escaping problem, the result is not valid JSON. Valid JSON needs double quotes; single quotes are just a string delimiter in Python.

Replace the last line in your code with print(json.dumps(file))

and proper JSON is returned: { "query": "occasionally I \"need\" to escape \"double\" quotes", "version": 1 }

Regards,

Melle

melle