0

I want to replace keys (keywords) in a dict that may have a complex structure: the values may be dicts or lists, nested dicts should also get their keys replaced, and list elements may themselves be dicts that need the same treatment. I wrote the following:

def replace_kw(obj,replace_this,with_this):
    print('object is '+str(obj))
    if isinstance(obj,dict):
        for k,v in obj.iteritems():
            if k==replace_this:
                obj[with_this]=obj[replace_this]
                del(obj[replace_this])
            else:
                obj[k] = replace_kw(obj[k],replace_this,with_this)
    elif isinstance(obj,list):
        for l in obj:
            l = replace_kw(l,replace_this,with_this)    
    return obj

This works on the simple examples I have come up with, but I am curious about other methods and about where this may go wrong. For instance, I checked whether a key can itself be a dictionary, and it seems the answer is no (dicts are unhashable), so that's one place I am not going wrong.

The example I gave was

d = {'data': [{'bbox_xywh': [838, 533, 50, 68], 'object': 'truck'},
              {'bbox_xywh': [930, 563, 60, 57], 'object': 'car'},
              {'bbox_xywh': [993, 560, 78, 56], 'object': 'car'},
              {'bbox_xywh': [997, 565, 57, 39], 'object': 'car'},
              {'bbox_xywh': [1094, 542, 194, 126], 'object': 'car'},
              {'bbox_xywh': [1311, 539, 36, 74], 'object': 'person'}],
     'dimensions_h_w_c': (1200, 1920, 3),
     'filename': '/data/jeremy/image_dbs/hls/voc_rio_udacity_kitti_insecam_shuf_no_aug_test/1478020901220540088.jpg'}

replace_kw(d,'bbox_xywh','bbox')

{'data': [{'bbox': [838, 533, 50, 68], 'object': 'truck'},
  {'bbox': [930, 563, 60, 57], 'object': 'car'},
  {'bbox': [993, 560, 78, 56], 'object': 'car'},
  {'bbox': [997, 565, 57, 39], 'object': 'car'},
  {'bbox': [1094, 542, 194, 126], 'object': 'car'},
  {'bbox': [1311, 539, 36, 74], 'object': 'person'}],
 'dimensions_h_w_c': (1200, 1920, 3),
 'filename': '/data/jeremy/image_dbs/hls/voc_rio_udacity_kitti_insecam_shuf_no_aug_test/1478020901220540088.jpg'}

which is what I expected.

jeremy_rutman

2 Answers

1

json

A quick-n-simple solution involves converting the entire thing to a string with json, using re.sub and then converting it back:

import json, re
json.loads(re.sub('(?<=")bbox_xywh(?=":)', 'bbox', json.dumps(d), flags=re.M))

{'data': [{'bbox': [838, 533, 50, 68], 'object': 'truck'},
  {'bbox': [930, 563, 60, 57], 'object': 'car'},
  {'bbox': [993, 560, 78, 56], 'object': 'car'},
  {'bbox': [997, 565, 57, 39], 'object': 'car'},
  {'bbox': [1094, 542, 194, 126], 'object': 'car'},
  {'bbox': [1311, 539, 36, 74], 'object': 'person'}],
 'dimensions_h_w_c': [1200, 1920, 3],
 'filename': '/data/jeremy/image_dbs/hls/voc_rio_udacity_kitti_insecam_shuf_no_aug_test/1478020901220540088.jpg'}

You may also consider using str.replace instead of regex (slightly faster):

json.loads(json.dumps(d).replace('"bbox_xywh":', '"bbox":'))

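You can trust json to stringify your data consistently, so this handles dictionaries of arbitrary structure.

This fails when your data isn't JSON-serializable: if it contains Python objects other than dicts, lists, strings, numbers, booleans and None (custom class instances, for example), json.dumps raises a TypeError. Note also that tuples come back as lists, as happened to dimensions_h_w_c in the output above.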
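To make these caveats concrete, here is a small sketch of what a json round trip does to tuples, to non-string keys, and to objects that json cannot serialize. The dict `sample` and the class `Box` are made up for illustration and are not part of the question's data.

```python
import json

# Hypothetical sample data, not taken from the question.
sample = {'bbox_xywh': (1, 2, 3, 4), 7: 'seven'}

round_tripped = json.loads(
    json.dumps(sample).replace('"bbox_xywh":', '"bbox":'))

# The key is renamed, but the tuple is now a list
# and the integer key 7 is now the string '7'.
print(round_tripped)


class Box(object):
    """Hypothetical custom class that json has no encoder for."""
    pass


try:
    json.dumps({'bbox_xywh': Box()})
    serializable = True
except TypeError:
    # json.dumps refuses objects it does not know how to encode.
    serializable = False

print(serializable)
```

If either of these side effects matters for your data, a recursive approach that never leaves Python objects is probably the safer choice.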


literal_eval

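Here's another method, using ast.literal_eval, that works around part of the problem mentioned above. It handles any structure built from Python literals (tuples included), though still not custom class instances: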

import ast
ast.literal_eval(str(d).replace("'bbox_xywh':", "'bbox':"))

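This does not coerce tuples to lists, but I'm not a fan of it because the quotes have to be treated very carefully: repr picks single or double quotes depending on what each string contains, so the pattern you search for may not match, or may match inside a value.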
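As an illustration of the quoting problem, here is a sketch with made-up data (the dict `tricky` is hypothetical) in which a string value happens to contain the same quoted text as the key. Because repr wraps that value in double quotes, the inner single-quoted text survives verbatim in str(tricky) and gets rewritten along with the real key.

```python
import ast

# Hypothetical data: the 'note' value contains the text 'bbox_xywh': itself.
tricky = {'bbox_xywh': [1, 2, 3, 4],
          'note': "the 'bbox_xywh': key is legacy"}

result = ast.literal_eval(str(tricky).replace("'bbox_xywh':", "'bbox':"))

# The key was renamed as intended...
print(sorted(result))
# ...but the text inside the 'note' value was changed as well.
print(result['note'])
```

The json-based version does not have this particular problem, because quotes inside JSON strings are escaped with a backslash and so never match the `"bbox_xywh":` pattern.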

cs95
  • i think that'll replace any string in the object - I am interested in replacing only keywords, on the off chance that that word may also occur 'innocently' somewhere else in the object – jeremy_rutman Jul 22 '17 at 11:42
  • @jeremy_rutman Look carefully. It does not. It checks for the quotes and colons surrounding the string. – cs95 Jul 22 '17 at 11:46
  • @jeremy_rutman Also note that `dimensions_h_w_c` was converted from a tuple to a list. My third solution fixes that. – cs95 Jul 22 '17 at 11:47
  • user2291982 seems to also have given a good answer, can i accept both? I don't have the background to judge one as better – jeremy_rutman Jul 22 '17 at 12:16
  • @jeremy_rutman You can only accept _one_, so accept the one that helped you most. By the way, `eval` is dangerous and you must _never_ use it. Read: https://nedbatchelder.com/blog/201206/eval_really_is_dangerous.html – cs95 Jul 22 '17 at 12:19
0
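def rep_str(obj, replace_this, with_this):
    # Only string keys are rewritten; other key types pass through unchanged.
    # Note: str.replace also rewrites keys that merely contain replace_this.
    if isinstance(obj, str):
        return obj.replace(replace_this, with_this)
    return obj

def change(obj, replace_this, with_this):
    # Recurse into lists and dicts, building new containers instead of mutating obj.
    if isinstance(obj, list):
        return [change(x, replace_this, with_this) for x in obj]
    if isinstance(obj, dict):
        return {rep_str(k, replace_this, with_this):
                change(v, replace_this, with_this) for k, v in obj.items()}
    return obj

# With the dict d from the question:
change(d, 'bbox_xywh', 'bbox')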
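If only exact key matches should be renamed, a variant along the following lines might be closer to what the question asks for. This is a sketch and not the answerer's code: the name `rename_key` and the sample `data` are made up here, and it assumes the containers are plain dicts, lists and tuples (a namedtuple would come back as a plain tuple).

```python
def rename_key(obj, old_key, new_key):
    """Return a copy of obj in which every dict key equal to old_key is renamed."""
    if isinstance(obj, dict):
        # Build a new dict, so nothing is mutated while it is being iterated.
        return {(new_key if k == old_key else k): rename_key(v, old_key, new_key)
                for k, v in obj.items()}
    if isinstance(obj, list):
        return [rename_key(x, old_key, new_key) for x in obj]
    if isinstance(obj, tuple):
        # Tuples are rebuilt as tuples, so they are not coerced to lists.
        return tuple(rename_key(x, old_key, new_key) for x in obj)
    return obj


# Hypothetical sample: one exact key, one key that only contains the old name,
# and one plain string value equal to the old name.
data = {'bbox_xywh': [1, 2],
        'items': ({'bbox_xywh': 3, 'bbox_xywh_old': 4}, 'bbox_xywh')}

renamed = rename_key(data, 'bbox_xywh', 'bbox')
print(renamed)
```

Unlike the substring version, this leaves 'bbox_xywh_old' alone and never touches string values. One thing it does not guard against is a dict that already contains new_key next to old_key; in that case one of the two values silently overwrites the other.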