I've seen that there are a fair few questions addressing more or less this issue, but I've not managed to apply them to my specific use-case, and I've been scratching my head and trying different solutions for a couple of days now.
I have a list of dictionaries, with their hierarchical position encoded as a string of index numbers - I want to rearrange the dictionaries into a nested hierarchy using these indices.
Here's some example data:
my_data = [{'id':1, 'text':'one', 'path':'1'},
           {'id':2, 'text':'two', 'path':'3.1'},
           {'id':3, 'text':'three', 'path':'2.1.1'},
           {'id':4, 'text':'four', 'path':'3.2.1'},
           {'id':5, 'text':'five', 'path':'2.1.2'},
           {'id':6, 'text':'six', 'path':'3.2.2'},
           {'id':7, 'text':'seven', 'path':'2'},
           {'id':8, 'text':'eight', 'path':'3'},
           {'id':9, 'text':'nine', 'path':'3.2'},
           {'id':10, 'text':'ten', 'path':'2.1'}]
and here's what I'm trying to achieve:
result = {1:{'id':1, 'text':'one', 'path':'1'},
          2:{'id':7, 'text':'seven', 'path':'2', 'children':{
              1:{'id':10, 'text':'ten', 'path':'2.1', 'children':{
                  1:{'id':3, 'text':'three', 'path':'2.1.1'},
                  2:{'id':5, 'text':'five', 'path':'2.1.2'}
                  }}}},
          3:{'id':8, 'text':'eight', 'path':'3', 'children':{
              1:{'id':2, 'text':'two', 'path':'3.1'},
              2:{'id':9, 'text':'nine', 'path':'3.2', 'children':{
                  1:{'id':4, 'text':'four', 'path':'3.2.1'},
                  2:{'id':6, 'text':'six', 'path':'3.2.2'}
                  }}}}
          }
Since the paths of the individual data dictionaries don't appear in any logical order, I'm using dictionaries throughout rather than lists of dictionaries, as this allows me to create 'empty' spaces in the structure. I don't really want to rely on re-ordering the dictionaries in the initial list.
Here's my code:
#%%
class my_dict(dict):
    def rec_update(self, index, dictObj): # extend the dict class with recursive update function
        """
        Parameters
        ----------
        index : list
            path to dictObj.
        dictObj : dict
            data object.
        Returns: updates the dictionary instance
        -------
        None.
        """
        pos = index[0]
        index.pop(0)
        if len(index) != 0:
            self.update({pos : {'children' : {self.rec_update(index, dictObj)}}})
        else:
            self.update({pos : dictObj})
#%%
dataOut = my_dict() #create empty dictionary to receive result
dataOut.clear()
# dictObj = my_data[0] # for testing
# dictObj = my_data[1]
for dictObj in my_data:
    index = dictObj.get('path').split(".") # create the path list
    dataOut.rec_update(index, dictObj) # place the current data dictionary in the hierarchy
The issue with the code is that the result of the nested function call in the class definition, self.rec_update(index, dictObj), isn't ending up as the value of the 'children' key. Is this because I've not understood the scope of self properly?
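A minimal sketch of the likely cause, which probably has little to do with the scope of self: a function with no return statement evaluates to None, and braces around a single expression build a set rather than a dict. The name no_return is made up purely for illustration.

```python
def no_return(d):
    # Mutates d in place; with no return statement the call evaluates to None.
    d.update({'a': 1})

target = {}
children = {no_return(target)}  # braces around one value form a set literal

print(target)          # {'a': 1}
print(children)        # {None}
print(type(children))  # <class 'set'>
```

Under that reading, the 'children' key in the code above would receive a set containing None, while the nested update is applied to the same top-level dictionary instead of to a child.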
I've noticed during testing that if I run the dataOut.rec_update(index, dictObj) call for a single element of my_data, e.g. dictObj = my_data[1], the index list variable in the console scope is modified. This is unexpected, as I thought the rec_update() function had its own distinct scope.
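A small sketch of why this happens, assuming nothing beyond standard Python behaviour: the function gets its own local name, but that name refers to the same list object the caller holds, so an in-place operation such as pop(0) is visible outside. The function name consume is hypothetical.

```python
def consume(path):
    # pop(0) removes the first item from the very list object passed in
    return path.pop(0)

index = '3.2.1'.split('.')
first = consume(index)
print(first, index)  # 3 ['2', '1']

index = '3.2.1'.split('.')
first = consume(list(index))  # pass a shallow copy instead
print(first, index)  # 3 ['3', '2', '1']
```

Slicing inside the function (for example, recursing on path[1:] rather than popping) would be another way to leave the caller's list untouched.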
I think I can see a further bug where the 'children' element will be overwritten, but I'm not at that stage yet.
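One possible way round the overwriting concern, offered as a sketch under assumptions rather than a definitive fix: walk the path with dict.setdefault so that existing nodes and their 'children' are reused, and merge each record into its node instead of replacing it. The function build_tree and the shortened sample list are hypothetical; keys are converted to int to match the desired result.

```python
def build_tree(items):
    """Nest flat records by their dotted 'path' (hypothetical helper)."""
    root = {}
    for item in items:
        keys = [int(k) for k in item['path'].split('.')]
        level = root
        # Walk down to the parent level, creating placeholder nodes as needed.
        for key in keys[:-1]:
            node = level.setdefault(key, {})
            level = node.setdefault('children', {})
        # Merge into the node so any 'children' already attached survive.
        level.setdefault(keys[-1], {}).update(item)
    return root

# Shortened sample: the child '3.1' deliberately arrives before its parent '3'.
sample = [{'id': 2, 'text': 'two', 'path': '3.1'},
          {'id': 8, 'text': 'eight', 'path': '3'},
          {'id': 1, 'text': 'one', 'path': '1'}]

tree = build_tree(sample)
print(tree[3]['id'])                 # 8
print(tree[3]['children'][1]['id'])  # 2
```

Because placeholders are created on the way down and filled in later, the order of the input list should not matter, and leaf nodes never gain an empty 'children' key.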
I'd welcome any explanation that can put me on the right track, please.