
I've been struggling for a while to convert hierarchical values stored in a flat table into a specific dictionary format. The main issue is that I can't figure out how to nest each category inside its corresponding parent key.

I have this table (as a pandas DataFrame) with three columns, one of which states the hierarchy level as a number:

Level    Name           Description
  0      Main               ...
  1      Sub main           ...
  2      Sub sub main       ...
  1      Sub main           ...
  2      Sub sub main       ...
  3      Sub sub sub main   ...
  0      Main_2             ...
       .  .  .

And the expected output should be something like this:

{
    "nodes": [
        {
            "name": "main",
            "description": "",
            "owners":{
                "users":["Sandra"]
            },
            "terms":[{
                "name":"",
                "description":""
            }]
            
        },
        
        {
            "nodes": [
                {
                    "name": "sub_main",
                    "description": "",
                    "owners":{
                        "users":[""]
                    },
                    "terms":[{
                        "name":"",
                        "description":"",
                        "inherits":[""]
                    }]

                },
                {
                    "nodes": [
                        {
                            "name": "sub_sub_main",
                            "description": "",
                            "owners":{
                                "users":[""]
                            },
                            "terms":[{
                                "name":"",
                                "description":"",
                                "inherits":[""]
                            }]

                        }
                    ]
                }
            ]
        }
    ]
}

I have a large table with multiple hierarchical levels. Some branches have just 2 or 3 levels and others have more, but the rows are always in order.

The other requirement is that the inherits section of each node must list the parents above it.

I'm trying to build a recursive function, but I have failed so far. I have also checked other similar questions.

Does anyone know of a question that covers this approach, or has anyone faced a similar problem?

Thank you all in advance!

Nick ODell
fserrey

1 Answer


Given this source dataframe:

import pandas as pd

df = pd.DataFrame.from_dict( {'Level': {0: 1, 1: 2, 2: 3, 3: 2, 4: 2, 5: 3, 6: 4, 7: 1},
    'Name': {0: 'Main 1',
    1: 'Sub main 1.1',
    2: 'Sub sub main 1.1.1',
    3: 'Sub main 1.2(a)',
    4: 'Sub main 1.2(b)',
    5: 'Sub sub main 1.2.1',
    6: 'Sub sub sub main 1.2.1.1',
    7: 'Main 2'},
    'Description': {0: 'Sandra',
    1: 'Andrew',
    2: 'Sally',
    3: 'Mark',
    4: 'Simon',
    5: 'Sinead',
    6: 'Holly',
    7: 'Max'}})

We can build a tree-shaped dictionary that hopefully comes close to what you want:

tree = { 
    'children': [], 
    'ancestors': [], 
    'parent': None, 
    'Level': 0,
    'Name': 'root'
}
curr_node = tree

for index, row in df.iterrows():
    # new child node
    if row.Level > curr_node['Level']: 
        parent = curr_node
        ancestors = parent['ancestors'] + [parent['Name']]
    # sibling node
    elif row.Level == curr_node['Level']:
        parent = curr_node['parent']
        ancestors = curr_node['ancestors'].copy()
    # ...or skip back up the hierarchy
    elif row.Level < curr_node['Level']:
        # skipping up until curr_node is the proper parent for this level
        while curr_node['Level'] >= row.Level:
            curr_node = curr_node['parent']
        parent = curr_node
        ancestors = curr_node['ancestors'] + [parent['Name']]
    # make new node with given parent & ancestors
    curr_node = row.to_dict()
    curr_node['children'] = []
    curr_node['parent'] = parent
    curr_node['ancestors'] = ancestors
    parent['children'].append(curr_node)

Results in:

from pprint import pprint
pprint(tree)

{'Level': 0,
 'Name': 'root',
 'ancestors': [],
 'children': [{'Description': 'Sandra',
               'Level': 1,
               'Name': 'Main 1',
               'ancestors': ['root'],
               'children': [{'Description': 'Andrew',
                             'Level': 2,
                             'Name': 'Sub main 1.1',
                             'ancestors': ['root', 'Main 1'],
                             'children': [{'Description': 'Sally',
                                           'Level': 3,
                                           'Name': 'Sub sub main 1.1.1',
                                           'ancestors': ['root',
                                                         'Main 1',
                                                         'Sub main 1.1'],
                                           'children': [],
                                           'parent': <Recursion on dict with id=140209092851264>}],
                             'parent': <Recursion on dict with id=140209092850624>},
                            {'Description': 'Mark',
                             'Level': 2,
                             'Name': 'Sub main 1.2(a)',
                             'ancestors': ['root', 'Main 1'],
                             'children': [],
                             'parent': <Recursion on dict with id=140209092850624>},
                            {'Description': 'Simon',
                             'Level': 2,
                             'Name': 'Sub main 1.2(b)',
                             'ancestors': ['root', 'Main 1'],
                             'children': [{'Description': 'Sinead',
                                           'Level': 3,
                                           'Name': 'Sub sub main 1.2.1',
                                           'ancestors': ['root',
                                                         'Main 1',
                                                         'Sub main 1.2(b)'],
                                           'children': [{'Description': 'Holly',
                                                         'Level': 4,
                                                         'Name': 'Sub sub sub '
                                                                 'main 1.2.1.1',
                                                         'ancestors': ['root',
                                                                       'Main 1',
                                                                       'Sub '
                                                                       'main '
                                                                       '1.2(b)',
                                                                       'Sub '
                                                                       'sub '
                                                                       'main '
                                                                       '1.2.1'],
                                                         'children': [],
                                                         'parent': <Recursion on dict with id=140209092851520>}],
                                           'parent': <Recursion on dict with id=140209092851456>}],
                             'parent': <Recursion on dict with id=140209092850624>}],
               'parent': <Recursion on dict with id=140209075210048>},
              {'Description': 'Max',
               'Level': 1,
               'Name': 'Main 2',
               'ancestors': ['root'],
               'children': [],
               'parent': <Recursion on dict with id=140209075210048>}],
 'parent': None}
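
One caveat: the 'parent' back-references make this tree circular, so json.dumps will refuse it with "Circular reference detected". If you need something closer to the nested "nodes" layout from the question, here is a minimal sketch of the same idea using a stack of currently open nodes instead of parent pointers. The function name build_nodes and the exact output layout (children under "nodes", ancestor names under "inherits" directly on each node, with "owners" and "terms" left out) are my assumptions, not something taken from your spec, so adjust as needed. It takes plain dicts, which you could get with df.to_dict('records').

```python
import json


def build_nodes(rows):
    """Nest flat, ordered rows (dicts with Level, Name, Description) by Level."""
    root = {"nodes": []}
    # Each stack entry is (level, node); the bottom entry is a sentinel root
    # whose level is lower than any real level (works for 0- or 1-based levels).
    stack = [(float("-inf"), root)]
    for row in rows:
        # Close every open node that cannot be an ancestor of this row.
        while stack[-1][0] >= row["Level"]:
            stack.pop()
        parent = stack[-1][1]
        node = {
            "name": row["Name"],
            "description": row["Description"],
            # Names of all open ancestors, outermost first (sentinel excluded).
            "inherits": [open_node["name"] for _, open_node in stack[1:]],
            "nodes": [],
        }
        parent["nodes"].append(node)
        stack.append((row["Level"], node))
    return root


# Small hand-written sample in the same shape as the DataFrame rows above.
rows = [
    {"Level": 1, "Name": "Main 1", "Description": "Sandra"},
    {"Level": 2, "Name": "Sub main 1.1", "Description": "Andrew"},
    {"Level": 3, "Name": "Sub sub main 1.1.1", "Description": "Sally"},
    {"Level": 2, "Name": "Sub main 1.2", "Description": "Mark"},
    {"Level": 1, "Name": "Main 2", "Description": "Max"},
]
result = build_nodes(rows)
print(json.dumps(result, indent=2))
```

Because no node points back at its parent, the result is a plain tree and serializes without any extra work. The stack only ever holds the chain from the root to the most recent row, which is why the rows being in order matters.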
kleynjan