In Python, how can I convert a string that looks like this:

"{
 apartment=false, 
 walls=[{min_height=18, max_height=3, color=WHITE}], 
 appliances=[{type=[oven, washing_machine, microwave, drying_machine, 
   dish_washer, television]}],
 rooms=[{bathroom=true, floor=2}, {bedroom=true, floor=[2,3], needs_renovation=EXCLUDE}], 
 value=[{sale_price=9003.01, occupied=true, family_unit=UNKNOWN}]
}"

into a dictionary object like this?

{
 "apartment": False, 
 "walls": [{"min_height": 18, "max_height": 3, "color": "WHITE"}], 
 "appliances": [{"type": ["oven", "washing_machine", "microwave", "drying_machine", 
   "dish_washer", "television"]}],
 "rooms": [{"bathroom": True, "floor": 2}, {"bedroom": True, "floor":[2,3], "needs_renovation": "EXCLUDE"}], 
 "value": [{"sale_price": 9003.01, "occupied": True, "family_unit": "UNKNOWN"}]
}

I tried the approach from "Simple way to convert a string to a dictionary", but it didn't get me far enough because it couldn't handle the nested dictionaries and lists.

Kevin

1 Answer

Use regex and normal string substitution, along with the json package:

import json
import re
from pprint import pprint

string = '''{
 apartment=false, 
 walls=[{min_height=18, max_height=3, color=WHITE}], 
 appliances=[{type=[oven, washing_machine, microwave, drying_machine, 
   dish_washer, television]}],
 rooms=[{bathroom=true, floor=2}, {bedroom=true, floor=[2,3], needs_renovation=EXCLUDE}], 
 value=[{sale_price=9003.01, occupied=true, family_unit=UNKNOWN}]
}'''

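# Join the lines and wrap every bare word (keys and string values) in double quotes.
processed = re.sub(r'([A-Za-z_]+)', r'"\1"', string.replace('\n', ''))
# JSON separates keys from values with ":" rather than "=".
processed = processed.replace('=', ':')
# JSON booleans must stay unquoted.
processed = processed.replace('"true"', 'true').replace('"false"', 'false')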

pprint(json.loads(processed))

Output:

{'apartment': False,
 'appliances': [{'type': ['oven',
                          'washing_machine',
                          'microwave',
                          'drying_machine',
                          'dish_washer',
                          'television']}],
 'rooms': [{'bathroom': True, 'floor': 2},
           {'bedroom': True, 'floor': [2, 3], 'needs_renovation': 'EXCLUDE'}],
 'value': [{'family_unit': 'UNKNOWN', 'occupied': True, 'sale_price': 9003.01}],
 'walls': [{'color': 'WHITE', 'max_height': 3, 'min_height': 18}]}
gmds
  • This is so good! In my data, other possible dictionary values include "abc_123", "123+" and "321-456". How can I update the regex to treat these as strings while keeping purely integer and float values as numbers? – Kevin Apr 24 '19 at 05:48
  • @Kevin You could change the first regex to use `\w` and add on another `re.sub` call to remove quote marks from only the numbers, I think? – gmds Apr 24 '19 at 05:50
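
One possible way to handle the values from the follow-up comment is sketched below. It is a variation on the suggestion above, not the exact two-step `re.sub` approach: instead of quoting everything and then unquoting the numbers, it decides token by token whether to quote. The names `TOKEN`, `NUMBER`, `quote_token` and `parse_loose` are made up for this example, and it assumes that values never contain whitespace, commas, brackets, braces or `=`, and that keys are never purely numeric.

```python
import json
import re

# A token is any run of characters that is not whitespace or structural
# punctuation ({ } [ ] , =). Assumption: values never contain these characters.
TOKEN = re.compile(r'[^\s{}\[\],=]+')

# A token that is already a valid JSON number (integer or float).
NUMBER = re.compile(r'-?(?:0|[1-9]\d*)(?:\.\d+)?(?:[eE][+-]?\d+)?')


def quote_token(match):
    """Leave booleans and numbers alone; quote every other token."""
    token = match.group(0)
    if token in ('true', 'false') or NUMBER.fullmatch(token):
        return token
    # json.dumps adds the double quotes and escapes anything that needs it.
    return json.dumps(token)


def parse_loose(text):
    """Hypothetical helper: convert the key=value notation to a dict."""
    quoted = TOKEN.sub(quote_token, text)
    # Tokens never contain "=", so every remaining "=" is a key/value separator.
    return json.loads(quoted.replace('=', ':'))


sample = '{id=abc_123, count=123+, range=321-456, floor=[2, 3], price=9003.01, occupied=true}'
print(parse_loose(sample))
```

Here "123+" and "321-456" stay strings because they do not match `NUMBER` as a whole, while 2, 3 and 9003.01 are passed through to `json.loads` as numbers. Values that contain spaces or `=` would need a real parser rather than this kind of substitution.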