Currently, I'm building two lists of titles and comparing them to find duplicates.
Instead, I want to remove the nested items from the dictionary recursively.

My question is: how do I select a deeply nested item and modify the dictionary during the recursion?

Current function:

def _finditem(obj, key):
    # a_list and b_list are module-level lists that collect the titles
    global a_list
    global b_list
    if isinstance(obj, dict):
        _finditem(obj['children'], key)
    else:
        for x in obj:
            if x['title'] == 'Bookmarks Menu':
                _finditem(x['children'], 'haha')
            elif x['title'] == 'surf':
                _finditem(x['children'], 'haha1')
            else:
                try:
                    _finditem(x['children'], key)
                except KeyError:
                    # x has no 'children' key, so treat it as a leaf
                    if key == 'haha':
                        a_list.append(x['title'])
                    elif key == 'haha1':
                        b_list.append(x['title'])
  • I'm not really sure what you're trying to do. It looks like you're currently appending to a list not removing from a dictionary? – Adam Smith May 19 '14 at 20:41
  • `del dct[key]` will delete the value for a key, which has nothing to do with how deeply anything is nested. You want to use iterators (as you would do in other languages as well) to delete items while iterating over a dictionary (see e.g. http://stackoverflow.com/questions/5384914/deleting-items-from-a-dictionary-while-iterating-over-it) – Oleg Sklyar May 19 '14 at 21:19

1 Answer

Modifying a list while iterating over it:

I used slice assignment to rebuild the list in place, keeping only the items that pass a test function.

def discard_func(list_of_dicts):
    list_of_dicts[:] = [x for x in list_of_dicts if test(x)]
    return list_of_dicts

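list[:] is slice syntax for the entire list. Assigning to it replaces the contents of the existing list object instead of creating a new one.
See also: Explain Python's slice notation
See also: Remove items from a list while iterating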
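
To see why the slice assignment matters, here is a small sketch that contrasts it with plain rebinding. The function names `rebind` and `mutate` and the sample data are made up for illustration and are not part of my script.

```python
def rebind(items):
    # Rebinding only changes the local name; the caller's list is untouched.
    items = [x for x in items if x % 2 == 0]
    return items


def mutate(items):
    # Slice assignment replaces the contents of the same list object.
    items[:] = [x for x in items if x % 2 == 0]
    return items


numbers = [1, 2, 3, 4]
alias = numbers          # second reference to the same list

rebind(numbers)
print(numbers)           # [1, 2, 3, 4]

mutate(numbers)
print(numbers)           # [2, 4]
print(alias is numbers)  # True
print(alias)             # [2, 4]
```

Because every reference still points at the same object, a parent dictionary that holds the list sees the change without any reassignment.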

Scope:

The slice assignment also solves the scope problem, since it modifies the original object rather than a copy. Even so, I added a return to each function and assigned the result of each recursive call, so that the variable being assigned to receives the returned value no matter how deep the recursion goes.

def _finditem(op_type, obj):
    if isinstance(obj, dict):
        obj['children'] = _finditem(op_type, obj['children'])
    else:
        for x in obj:
            if x['title'] in subjects[op_type]:
                x['children'] = operations[op_type](x['children'])
            else:
                try:
                    x['children'] = _finditem(op_type, x['children'])
                except KeyError:
                    # x has no 'children' key (a leaf), so skip it
                    continue
    return obj
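
The return-and-assign pattern can be tried on a small tree. This is only a sketch: `prune`, the `unwanted` set and the sample data are hypothetical, and the tree merely imitates the 'title'/'children' layout of a bookmarks export.

```python
def prune(node, unwanted):
    """Remove every entry whose title is in `unwanted`, at any depth."""
    if isinstance(node, dict):
        if 'children' in node:
            # Assign the result back, as in _finditem above.
            node['children'] = prune(node['children'], unwanted)
        return node
    # node is a list of dicts: filter it in place, then recurse into survivors.
    node[:] = [child for child in node if child['title'] not in unwanted]
    for child in node:
        prune(child, unwanted)
    return node


tree = {
    'title': 'root',
    'children': [
        {'title': 'Bookmarks Menu', 'children': [
            {'title': 'python.org'},
            {'title': 'old link'},
        ]},
        {'title': 'old link'},
    ],
}

result = prune(tree, {'old link'})
print(result is tree)                    # True
print(len(tree['children']))             # 1
print(tree['children'][0]['children'])   # [{'title': 'python.org'}]
```

The returned object is the same dictionary that was passed in, so the assignment is redundant here, but it keeps working if a function ever returns a new list instead of mutating the old one.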

Entire file:

assign_var = {'compare' : None , 'discard' : [] , 'keep' : [] , 'pairs' : None}
subjects = {'keep' : ['Bookmarks Menu'] , 'discard' : ['surf'] , 'compare' : None , 'pairs' : [ {'keep' : ['Bookmarks Menu'] , 'discard' : ['surf']} , {'keep':'bla','discard':'etc'} ] }
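def test(y):
    # Folders are always kept (and filtered recursively unless protected);
    # leaf entries are dropped when their title was already collected.
    if 'children' in y:
        if y['title'] not in subjects['keep']:
            discard_func(y['children'])
    elif y['title'] in assign_var['keep']:
        print('Duplicate found')
        return False
    return True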
def discard_func(loo):
    loo[:] = [x for x in loo if test(x)]
    return loo
def keep_func(x):
    global assign_var
    for y in x:
        if 'children' in y:
            if y['title'] not in subjects['discard']:
                keep_func(y['children'])
            else:
                continue
        else:
            assign_var['keep'].append(y['title'])
    return x
def _finditem(op_type, obj):
    if isinstance(obj, dict):
        obj['children'] = _finditem(op_type, obj['children'])
    else:
        for x in obj:
            if x['title'] in subjects[op_type]:
                x['children'] = operations[op_type](x['children'])
            else:
                try:
                    x['children'] = _finditem(op_type, x['children'])
                except KeyError:
                    # x has no 'children' key (a leaf), so skip it
                    continue
    return obj
operations = { 'keep' : keep_func , 'discard' : discard_func , 'pairs' : None , 'compare' : None }
def parent_function():
    op_type = 'keep'
    _finditem(op_type, book)
    op_type = 'discard'
    book_new = _finditem(op_type, book)
    # for op_type in assign_var:
    #   try:
    #       _finditem(op_type = op_type, obj = book)
    #   except:
    #       pass
    return book_new
if __name__ == '__main__':
    print('In __main__')
    import json
    loc = 'C:\\workspace\\temp\\'
    with open(loc + 'bookmarks-2014-05-24.json', 'r') as nctn:
        book = json.load(nctn)
    book_new = parent_function()
    with open(loc + 'bookmarks-2014-05-24_new.json', 'w') as nctn:
        json.dump(book_new, nctn)
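
For comparison, the same two passes (collect the titles to keep, then drop duplicates elsewhere) could be sketched without the module-level dictionaries. This is not the script above: `collect_titles`, `drop_duplicates`, their parameters and the sample `book` are hypothetical, and the folders are picked by position rather than searched for by title.

```python
def collect_titles(items, skip_folders):
    """Return the titles of all leaf entries, ignoring folders in skip_folders."""
    titles = set()
    for item in items:
        if 'children' in item:
            if item['title'] not in skip_folders:
                titles |= collect_titles(item['children'], skip_folders)
        else:
            titles.add(item['title'])
    return titles


def drop_duplicates(items, seen, protected_folders):
    """Remove, in place, leaf entries whose title is in `seen`."""
    kept = []
    for item in items:
        if 'children' in item:
            if item['title'] not in protected_folders:
                drop_duplicates(item['children'], seen, protected_folders)
            kept.append(item)          # folders themselves are never removed
        elif item['title'] not in seen:
            kept.append(item)
    items[:] = kept                    # slice assignment keeps the same list
    return items


# Made-up sample data in the 'title'/'children' layout.
book = {'title': '', 'children': [
    {'title': 'Bookmarks Menu', 'children': [{'title': 'a'}, {'title': 'b'}]},
    {'title': 'surf', 'children': [
        {'title': 'b'},
        {'title': 'c'},
        {'title': 'nested', 'children': [{'title': 'a'}]},
    ]},
]}

menu, surf = book['children']
seen = collect_titles(menu['children'], skip_folders={'surf'})
drop_duplicates(surf['children'], seen, protected_folders={'Bookmarks Menu'})
print([item['title'] for item in surf['children']])  # ['c', 'nested']
print(surf['children'][1]['children'])               # []
```

Passing the collected titles as a set makes the duplicate check a constant-time lookup and removes the need for `global`.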