5

I have the following list of nested dictionaries and lists. I want to create a new parent (L0) category called 'food', and shift all the values in the fruit and vegs dictionaries one step down (so that 'L0': 'fruit' becomes 'L1': 'fruit', 'L1': 'banana' becomes 'L2': 'banana', etc).

D = [{
        "L0": "fruit",
        "L1_list": [
            {
                "L1": "banana"
            },
            {
                "L1": "apple", 
                "L2_list": [
                    {
                        "L2": "Green apple"
                    }, 
                    {
                        "L2": "Red apple"
                    }
                ]
            }, 
            {
                "L1": "kiwi"
            }
        ]
    },
    {
        "L0": "vegs", 
        "L1_list": [
            {
                "L1": "potato"
            }, 
            {
                "L1": "carrot"
            }
        ]
    }]

The excepted output should look like this:

Expected_output = [
    {
        "L0": "food",
        "L1_list": [
            {
                "L1": "fruit",
                "L2_list": [
                    {
                        "L2": "banana"
                    },
                    {
                        "L2": "apple",
                        "L3_list": [
                            {
                                "L3": "Green apple"
                            },
                            {
                                "L3": "Redapple"
                            }
                        ]
                    },
                    {
                        "L2": "kiwi"
                    }
                ]
            },
            {
                "L1": "vegs",
                "L2_list": [
                    {
                        "L2": "potato"
                    },
                    {
                        "L2": "carrot"
                    }
                ]
            }
        ]
    }
]

Now, because my dictionaries can vary in size and how deep they can go, I need a programmatic solution. So I thought I would create a recursive function that iterates util it reaches the end of the tree. As the function reaches the end of a particular branch, it would add 1 to the key (L0 --> L1, L1_list --> L2_list). Although the process does indeed shift everything one level down, I can't figure out how to rebuild the initial structure. In particular, I can't bring the children back into their respective list.

Final_list = []
def digger(list_to_dig):
    import re
    for x in list_to_dig:
        for k,v in x.items():
            if isinstance(v, list):
                print("keep digging")
                digger(v)
            elif isinstance(x, dict):
                new_D = {}
                new_k = "L" + str(int(re.sub("L", "", k)) + 1)
                new_D[new_k] = v
                temp = re.sub("L", "", k)
                new_child_list = "L" + str(int(re.sub("_list", "", temp)) + 2) + "_list"
                new_D[new_child_list] = ""
                Final_list.append(new_D)
            else:
                print("no dictionary avail")
                pass
    print("_________")
    print(Final_list)
    print("_________")

    test = digger(D)

Any suggestions on how I should tackle this? Many thanks

Following the suggestion of @running.t I have tried to use the dict.pop method. However, because it takes place within an iteration, it pops the old key, creates and inserts the new one, but on the next iteration will take the new key just created, pops it, and creates and inserts a new new key, and so on (although it doesn't go into an infinite loop either).

Here is a simplified example to illustrate the problem:

Step 1 create new top level dict

new_top_level = {"L0": "Food"}
new_dict = {}
for k, v in new_top_level.items():
    lst_k = "L" + str(int(re.sub("L", "", ka)) + 1) + "_list"
    new_dict[k] = v
    new_dict[lst_k] = []

Step 2 add the old tree in the new list

old_d = {'L0': 'Fruit', 'L1_list': [{'L1': 'Green apple'}, {'L1': 'Red apple'}]}
new_dict[lst_k].append(old_d)

Step 3 add 1 to all the keys of the old tree

def digger(list_to_update):
    import re
    pattern1 = r"L.$"
    pattern2 = r"L._list"
    for x in list_to_update:
        for k1, v1 in x.items():
            if re.match(pattern1, k1):
                new_k1 = "L" + str(int(re.sub("L", "", k1)) + 1)
                x[new_k1] = x.pop(k1)
            elif re.match(pattern2, k1):
                temp = re.sub("L", "", k1)
                new_k1 = "L" + str(int(re.sub("_list", "", temp)) + 1) + "_list"
                x[new_k1] = x.pop(k1)
                digger(v1)

test = digger(new_dict[lst_k])
Mad Physicist
  • 107,652
  • 25
  • 181
  • 264
Intel_code
  • 53
  • 4

2 Answers2

0

You should not create new list and put everything into it. And actually this is what you're doing in line:

Final_list.append(new_D)

What you should do instead is to recursively iterate all dictionaries and lists you have (the same way you do it currently) and if object is a dict rename all keys in that dict accordingly.

Here you can find how to rename keys i dict. I think the best answer there suggest using following:

new_k = "L"+str(int(re.sub("L","",k))+1) 
x[new_key] = x.pop(k)

And finally, after finishing digging all D your should put modified D inside new Expected_output list.

running.t
  • 5,329
  • 3
  • 32
  • 50
  • Thanks! very useful. I am not familiar with the pop function but will look into it and let you know how it goes – Intel_code Apr 11 '18 at 12:45
0

A year late, I know, but let's do a quick analysis of the problem in prose. You have a dictionary. The dictionary can have two types of keys: L* and L*_list. In both cases * is an integer. L* will always have a string value. L*_list will always have a list-of-dictionaries value. Your goal is to recursively increment the integers in the key names.

Clearly something like that lends itself well to recursion. You recurse into each element of a L*_list value. The recursion ends when you get a list of dictionaries that do not have L*_list keys. In that case, you only increment the L* keys and return. Up to this point, we completely agree, since everything I have said is already in the question.

To answer the actual question, we only need to make one change: the recursive function needs to either modify the nested objects in-place, or return a new replacement object. It is simpler to construct an entirely new data structure than to modify the existing dictionaries in place because it makes the iteration easier (which you also noticed).

There is a bit of a special case at the top level, since you want to push everything into a new food category. This is not a problem, since the recursive solution will return the value for the new L1_list key.

Here is a simple sample implementation:

def increment_keys(d):
    def process_key(key, value):
        key = f'L{int(key[1:]) + 1}'
        return key, value

    def process_list(key, value):
        key = f'L{int(key[1:-5]) + 1}_list'
        value = [increment_keys(d) for d in value]
        return key, value

    def process(key, value):
        if key.endswith('_list'):
            return process_list(key, value)
        return process_key(key, value)

    return dict(process(key, value) for key, value in d.items())

expected_output = [{'L0': 'food', **increment_keys({'L0_list': D})}]

You can absorb the nested process function into the generator that feeds into the return value of increment_keys by using a ternary operator. I don't think it improves readability any, but it would save you about four lines:

return dict(process_list(k, v) if k.endswith('_list') else process_key(k, v) for k, v in d)

Now if you absolutely had to do this in-place, the best way would be to freeze the keys of each dictionary before iterating. If you iterate over the frozen keys, pop and __setitem__ will not cause you any problems.

Since you would never get duplicates between original and incremented keys at a given level, you don't have to pay any special attention to losing prior values (e.g., if you had L1 and L2 in the same dict, and incremented L1 first.

Here is a sample in-place recursion:

def increment_keys(obj):
    def process(d):
        for key in list(d.keys()):
            value = d.pop(key)
            if key.endswith('_list'):
                key = f'L{int(key[1:-5]) + 1}_list'
                increment_keys(value)
            else:
                key = f'L{int(key[1:]) + 1}'
            d[key] = value

    for d in obj:
        process(d)

increment_keys(D)
expected_output = [{'L0': 'food', 'L1_list': D}]

In keeping with Python convention, I did not return anything from the in-place function.

Mad Physicist
  • 107,652
  • 25
  • 181
  • 264