print(json_text)
'{\n                    \n                        acorn: "",\n                        acorn_type: "",\n                        area_name: "Glasgow",\n                        beds_max: 2,\n                        beds_min: 2,\n }

I tried to solve it by doing:

json_text = re.sub(r'\n', '',json_text)
json_text = re.sub(r' ', '',json_text)

Then the result:

print(json_text)
'{acorn:"",acorn_type:"",area_name:"Glasgow",beds_max:2,beds_min:2,branch_id:"32896"}

Then I tried to convert it to JSON:

json_text= json.dumps(json_text)
json_text = json.loads(json_text)

But the final value is a string.

json_text['area_name']
TypeError: string indices must be integers

I think it is because the keys don't have quotation marks ("").

cgomezfandino
  • Put quotes around the keys (i.e. "acorn" and not acorn) –  Jul 27 '19 at 13:55
  • It's not valid JSON, the string keys are unquoted by your approach – roganjosh Jul 27 '19 at 13:56
  • How did you get `Json_text`? It is not correct JSON. If you generate this JSON yourself, then change the method and use the `json` module for this. – furas Jul 27 '19 at 13:57
  • That's not valid JSON, so of course you couldn't parse it. – basilisk Jul 27 '19 at 13:58
  • you can try module [dirtyjson](https://github.com/codecobblers/dirtyjson) - it can handle some problems in wrong JSON. – furas Jul 27 '19 at 13:59
  • here is the previous question: https://stackoverflow.com/questions/57232015/get-info-from-script-tag-webscrap?noredirect=1#comment100967012_57232015 – cgomezfandino Jul 27 '19 at 14:02
  • Can you specify what you are trying to achieve? Your code seems inconsistent in some ways, e.g. 1) `json.dumps` takes a Python object (e.g. a dictionary) as input and turns it into a string, but you give it a string; 2) it's not clear to me why you would manipulate your string before writing to JSON, etc. It would be good to know 1. Where is your input from? 2. Why are you converting to JSON? – patrick Jul 27 '19 at 14:03
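
The point made in the comments about `json.dumps` can be checked directly. The sketch below assumes a shortened version of the whitespace-stripped string from the question (the name `raw` is only illustrative). It shows that dumping a `str` and loading it back returns the same `str`, and that loading the text directly fails because the keys are unquoted.

```python
import json

# Shortened, whitespace-stripped text with unquoted keys, as in the question.
raw = '{acorn:"",area_name:"Glasgow"}'

encoded = json.dumps(raw)      # wraps the whole str in a JSON string literal
decoded = json.loads(encoded)  # decoding that literal gives the str back, not a dict

print(type(decoded).__name__)
print(decoded == raw)

# Loading the text directly fails, because JSON requires double-quoted keys.
try:
    json.loads(raw)
    parsed_directly = True
except json.JSONDecodeError as exc:
    parsed_directly = False
    print("not valid JSON:", exc.msg)
```

This is why `json_text['area_name']` raises `TypeError`: the object being indexed is still a string.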

1 Answer


You need to do a couple of replacements to make it parseable as JSON:

In [120]: text = '{\n                    \n                        acorn: "",\n                        acorn_type: "",\n                        area_name: "Glasgow",\n                        beds_max: 2,\n                        beds_min: 2,\n }'

In [121]: json.loads(re.sub(r'\b([^:",]+)(?=:)', r'"\1"', re.sub(r'\s*|,\s*(?=\}$)', '', text)))                                                                                                            
Out[121]: 
{'acorn': '',
 'acorn_type': '',
 'area_name': 'Glasgow',
 'beds_max': 2,
 'beds_min': 2}

First, we need to drop all whitespace and the trailing comma:

In [122]: re.sub(r'\s*|,\s*(?=\}$)', '', text)                                                                                                                                                              
Out[122]: '{acorn:"",acorn_type:"",area_name:"Glasgow",beds_max:2,beds_min:2}'
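
One caveat with this step: `\s*` also removes whitespace inside quoted values. The sketch below uses a hypothetical input (`sample`, with a value that contains a space) to show the effect. It also shows a narrower pattern that removes only the trailing comma, which may be enough because JSON parsers accept whitespace between tokens.

```python
import re

# Hypothetical input: same shape as the question's text, but one value contains a space.
sample = '{\n    area_name: "New York",\n    beds_max: 2,\n }'

# The whitespace-stripping pattern from above also collapses the space inside the value.
stripped = re.sub(r'\s*|,\s*(?=\}$)', '', sample)
print(stripped)

# Narrower alternative: remove only a comma that directly precedes the final '}'.
trimmed = re.sub(r',\s*(?=\}\s*$)', '', sample)
print(trimmed)
```

With the narrower pattern the value keeps its space, and the remaining newlines and indentation are harmless to `json.loads` once the keys are quoted.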

Now, on the returned string, we need to add double quotes to the keys:

In [123]: re.sub(r'\b([^:",]+)(?=:)', r'"\1"', re.sub(r'\s*|,\s*(?=\}$)', '', text))                                                                                                                        
Out[123]: '{"acorn":"","acorn_type":"","area_name":"Glasgow","beds_max":2,"beds_min":2}'

Now json.loads can parse it:

In [124]: json.loads(re.sub(r'\b([^:",]+)(?=:)', r'"\1"', re.sub(r'\s*|,\s*(?=\}$)', '', text)))                                                                                                            
Out[124]: 
{'acorn': '',
 'acorn_type': '',
 'area_name': 'Glasgow',
 'beds_max': 2,
 'beds_min': 2}

The same steps, using intermediate names:

In [125]: text                                                                                                                                                                                              
Out[125]: '{\n                    \n                        acorn: "",\n                        acorn_type: "",\n                        area_name: "Glasgow",\n                        beds_max: 2,\n                        beds_min: 2,\n }'

In [126]: text_wo_whitespaces = re.sub(r'\s*|,\s*(?=\}$)', '', text)                                                                                                                                        

In [127]: text_quoted = re.sub(r'\b([^:",]+)(?=:)', r'"\1"', text_wo_whitespaces)                                                                                                                           

In [128]: json.loads(text_quoted)                                                                                                                                                                           
Out[128]: 
{'acorn': '',
 'acorn_type': '',
 'area_name': 'Glasgow',
 'beds_max': 2,
 'beds_min': 2}
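
A slightly more conservative variant of the same idea is sketched below. It is not the code above: the helper name `loads_unquoted_keys` is hypothetical, and it assumes that keys are bare identifiers and that no string value contains text resembling `, name:`. It leaves whitespace alone, so values containing spaces survive.

```python
import json
import re


def loads_unquoted_keys(text):
    """Parse a JSON-like object whose keys are bare identifiers (hypothetical helper)."""
    # Drop a comma that sits directly before a closing brace or bracket.
    text = re.sub(r',(\s*[}\]])', r'\1', text)
    # Quote bare identifiers that follow '{' or ',' and are followed by ':'.
    text = re.sub(r'([{,]\s*)([A-Za-z_]\w*)(\s*:)', r'\1"\2"\3', text)
    return json.loads(text)


text = '{\n    acorn: "",\n    acorn_type: "",\n    area_name: "Glasgow",\n    beds_max: 2,\n    beds_min: 2,\n }'
data = loads_unquoted_keys(text)
print(data['area_name'])
```

Because the key pattern is anchored on a preceding `{` or `,`, a colon inside a value such as a URL is not mistaken for a key separator, although a regex approach like this remains a heuristic rather than a real parser.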
heemayl