0

Python: escape double-quote characters and convert the string to JSON

I have tried escaping the double quotes with escape characters, but that didn't work either.

import json

raw_string = '[{"Attribute":"color","Keywords":"green","AttributeComments":null},{"Attribute":" season","Keywords":["Holly Berry"],"AttributeComments":null},{"Attribute":" size","Keywords":"20"x30"","AttributeComments":null},{"Attribute":" unit","Keywords":"1","AttributeComments":null}]'

new_data = json.loads(raw_string)

Loading fails with the error: Expecting ',' delimiter: line 1 column 180 (char 179)

The expected output is a JSON string.

Shubham
  • Bad value here ---> `:"20"x30""`, you need to fix that – RomanPerekhrest Aug 30 '19 at 09:46
  • There is a formatting error here: `"Keywords":"20"x30""`, change it to `"Keywords":"20x30"` for example – olinox14 Aug 30 '19 at 09:47
  • @RomanPerekhrest Thank you for your reply, but that's the data I'm fetching from the database – Shubham Aug 30 '19 at 09:56
  • Your python string `raw_string` is a valid string, but is not valid `json`. You need to get the string fixed first. How did you get hold of that string? Why do you think it might be `json`? – quamrana Aug 30 '19 at 10:08
  • Possible duplicate of [How do I automatically fix an invalid JSON string?](https://stackoverflow.com/questions/18514910/how-do-i-automatically-fix-an-invalid-json-string) – quamrana Aug 30 '19 at 10:25

3 Answers

2

The correct JSON string, with escaped quotes, should look like this:

[{
    "Attribute": "color",
    "Keywords": "green",
    "AttributeComments": null
}, {
    "Attribute": " season",
    "Keywords": ["Holly Berry"],
    "AttributeComments": null
}, {
    "Attribute": " size",
    "Keywords": "20\"x30\"",
    "AttributeComments": null
}, {
    "Attribute": " unit",
    "Keywords": "1",
    "AttributeComments": null
}]
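
For comparison, here is a minimal sketch of how Python's own `json.dumps` escapes such a value when it writes JSON. It assumes the intended size value is `20"x30"` with both inch marks kept, which the question does not confirm:

```python
import json

# Assumption: the intended value keeps both inch marks.
record = {"Attribute": " size", "Keywords": '20"x30"', "AttributeComments": None}

# json.dumps escapes the embedded double quotes with backslashes.
encoded = json.dumps(record)
print(encoded)

# Round trip: the escaped form parses back to the original value.
decoded = json.loads(encoded)
print(decoded["Keywords"])
```

If whatever writes the rows to the database serializes them this way rather than by string concatenation, the stored text should always be parseable.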

Edit: You can use a regular expression to correct the string in Python, resulting in valid JSON:

import re
import json

raw_string = '[{"Attribute":"color","Keywords":"green","AttributeComments":null},{"Attribute":" season","Keywords":["Holly Berry"],"AttributeComments":null},{"Attribute":" size","Keywords":"20"x30"","AttributeComments":null},{"Attribute":" unit","Keywords":"1","AttributeComments":null}]'

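# Match the malformed size value, e.g. "20"x30"", capturing both numbers.
pattern = r'"Keywords":"(\d+)"x(\d+)""'
# Raw replacement string so that \g<1> and \g<2> reach re.sub unchanged.
corrected_string = re.sub(pattern, r'"Keywords": "\g<1>x\g<2>"', raw_string)
print(json.loads(corrected_string))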

Output (Python 2):

[{u'Keywords': u'green', u'Attribute': u'color', u'AttributeComments': None}, {u'Keywords': [u'Holly Berry'], u'Attribute': u' season', u'AttributeComments': None}, {u'Keywords': u'20x30', u'Attribute': u' size', u'AttributeComments': None}, {u'Keywords': u'1', u'Attribute': u' unit', u'AttributeComments': None}]
Maurice Meyer
1
import json

raw_string = '[{"Attribute":"color","Keywords":"green","AttributeComments":null},{"Attribute":" season","Keywords":["Holly Berry"],"AttributeComments":null},{"Attribute":" size","Keywords":"20x30","AttributeComments":null},{"Attribute":" unit","Keywords":"1","AttributeComments":null}]'

new_data = json.loads(raw_string)
  • Thank you for your reply. I want to convert the string to a Python dictionary – Shubham Aug 30 '19 at 09:54
  • I've edited my code. Try it out now. There was an error with the Keywords value which needed to be corrected. – Harshvardhan Arora Aug 30 '19 at 09:56
  • Thank you for your reply. I know the problem is within `"Keywords":"20"x30""`, but I don't know how to rectify it – Shubham Aug 30 '19 at 09:59
  • Could you explain what you mean by that? – Harshvardhan Arora Aug 30 '19 at 10:03
  • As you can see, the raw_string value is: ```raw_string = '[{"Attribute":"color","Keywords":"green","AttributeComments":null},{"Attribute":" season","Keywords":["Holly Berry"],"AttributeComments":null},{"Attribute":" size","Keywords":"20"x30"","AttributeComments":null},{"Attribute":" unit","Keywords":"1","AttributeComments":null}]'``` which I want to convert to JSON – Shubham Aug 30 '19 at 10:05
  • I mean, which database are you fetching this from? And how is the database being populated? The error is in the procedure through which you are populating the database. – Harshvardhan Arora Aug 30 '19 at 10:07
  • It's stored in a PostgreSQL database – Shubham Aug 30 '19 at 10:12
1

First of all, change the key-value pair "Keywords":"20"x30"" to "Keywords":"20x30"; as written, it is not valid JSON. If this JSON was not written by you but generated by some other source, check that source. You can check whether the JSON is valid using JSONLint: just paste your JSON there.
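
The same validity check can also be done locally in Python. A minimal sketch, in which the helper name `find_json_error` and the shortened `sample` string are illustrative assumptions; it relies on `json.JSONDecodeError`, which carries the failure message and the character position:

```python
import json

def find_json_error(text, context=15):
    """Return None if text parses as JSON, else (message, position, snippet)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as exc:
        # exc.pos is the 0-based character index where parsing failed.
        start = max(exc.pos - context, 0)
        return exc.msg, exc.pos, text[start:exc.pos + context]
    return None

# Shortened version of the string from the question.
sample = '[{"Attribute":" size","Keywords":"20"x30"","AttributeComments":null}]'
print(find_json_error(sample))
```

The returned snippet shows the text around the failure, which makes it easier to spot the offending value in a long single-line string.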

As for your code:

import json

raw_string = '[{"Attribute":"color","Keywords":"green","AttributeComments":null},{"Attribute":" season","Keywords":["Holly Berry"],"AttributeComments":null},{"Attribute":" size","Keywords":"20x30","AttributeComments":null},{"Attribute":" unit","Keywords":"1","AttributeComments":null}]'    
new_data = json.loads(raw_string)

Here new_data is a list. If you check the type of its first element using print(type(new_data[0])), you'll find it is the dict you wanted.

EDIT: Since you say you are fetching this JSON from a database, check whether all the JSON values there carry this type of formatting error. If so, you'd want to check where they are being generated. Your options are either to correct it at the source or, if this is a one-off problem, to correct it manually by adding escape characters. I strongly suggest the former.
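
If fixing the source is not possible, the "adding escape characters" option could be sketched roughly as below. This is a heuristic, not a general fix: it assumes compact JSON with no whitespace around separators, and that no text value contains a quote directly next to `,`, `:`, `]` or `}`. The name `escape_stray_quotes` is hypothetical.

```python
import json
import re

# Heuristic: in compact JSON, a structural quote either follows one of [ { , :
# or precedes one of ] } , : . Any other unescaped quote is assumed to be part
# of the text and gets a backslash in front of it.
STRAY_QUOTE = re.compile(r'(?<![\[{,:\\])"(?![\]},:])')

def escape_stray_quotes(text):
    return STRAY_QUOTE.sub(lambda match: '\\"', text)

raw_string = '[{"Attribute":"color","Keywords":"green","AttributeComments":null},{"Attribute":" season","Keywords":["Holly Berry"],"AttributeComments":null},{"Attribute":" size","Keywords":"20"x30"","AttributeComments":null},{"Attribute":" unit","Keywords":"1","AttributeComments":null}]'

new_data = json.loads(escape_stray_quotes(raw_string))
print(new_data[2]["Keywords"])
```

Unlike removing the quotes, this keeps the inch marks in the value, but it will misjudge any string that happens to contain a sequence such as `",` inside its text, so the result should be checked before it is trusted.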

akshayks
  • Good answer, but how do you know that `"Keywords":"20x30"` is correct? If it were `"Keywords":"20'x30'"` wouldn't that also be correct? – quamrana Aug 30 '19 at 10:10
  • @akshayks I know it's not valid JSON. I want to properly format the string so that it can be converted to JSON – Shubham Aug 30 '19 at 10:10
  • @Shubham: There is no way of finding this and correcting it. It should have been corrected before you get to see it. – quamrana Aug 30 '19 at 10:13
  • @quamrana Yes, I assumed that it is so. You may be right, that is why I asked OP to check the source if that is indeed the format he wants. – akshayks Aug 30 '19 at 10:14