Create Nested JSON from Pandas for Org Chart

Question

I'm trying to create a nested JSON object from a hierarchical DataFrame (python 3.5) to feed into JavaScript to render an Org Chart. I'm essentially trying to create the structure found in the answer of this question: Organization chart - tree, online, dynamic, collapsible, pictures - in D3

An example dataframe:

df = pd.DataFrame({\
'Manager_Name':['Mike' ,'Jon', 'Susan' ,'Susan' ,'Joe'],\
'Manager_Title':['Level1' ,'Level2'  ,'Level3' ,"Level3", 'Level4'],\
'Employee_Name':['Jon' ,'Susan' ,'Josh' ,'Joe' ,'Jimmy'],\
'Employee_Title':["Level2" ,"Level3" ,"Level4" ,"Level4" ,"Level5"]})

The desired output would be:

"Name": "Mike"
"Title": "Level1"
"Employees": [{
        "Name": "Jon"
        "Title": "Level2"
        "Employees": [{
               "Name": "Susan"
               "Title": "Level3"
               "Employees": [{
               ...
               ...
               ...
               }]
      }]
}]

I know this isn't a code generating service but I've tried applying other similarly related answers and can't seem to apply those answers here. I also haven't worked with dictionaries that much (I'm more of an R person) so there's probably some noobishness to this question. I've more time than I should on this yet I'm sure someone here can do this in a few minutes.

Other questions:

Thanks in advance!

What is the criteria for the root element? Is it `Title == 'Level1'`? — DocZerø, Aug 18 '16 at 14:36

Parfait · Answer 1 · 2016-08-19T03:18:13.960

Consider filtering out dataframe by Level and converting dfs to dictionary with pandas to_dict() which are continually rolled into one list across levels. Below defined function walks from last level to first to roll up individual Employee Levels dictionaries. But first you should concatenate Manager and Employee Name and Title columns.

import json
import pandas as pd

cdf = pd.concat([df[['Manager_Name', 'Manager_Title']].\
            rename(index=str, columns={'Manager_Name':'Name', 'Manager_Title':'Title'}),
            df[['Employee_Name', 'Employee_Title']].\
            rename(index=str, columns={'Employee_Name':'Name', 'Employee_Title':'Title'})])

cdf = cdf.drop_duplicates().reset_index(drop=True)
print(cdf)
#     Name   Title
# 0   Mike  Level1
# 1    Jon  Level2
# 2  Susan  Level3
# 3    Joe  Level4
# 4   Josh  Level4
# 5  Jimmy  Level5

def jsondict():
    inner = ['']
    for i in ['Level5', 'Level4', 'Level3', 'Level2']:            
        if i == 'Level5':
            inner[0] = cdf[cdf['Title']==i].to_dict(orient='records')            
        else:
            tmp = cdf[cdf['Title']==i].copy().reset_index(drop=True)            
            if len(tmp) == 1:
                tmp['Employees'] = [inner[0]]
            else:
                for d in range(0,len(tmp)):
                    tmp.ix[d, 'Employees'] = [inner[0]]                                
            lvltemp = tmp.to_dict(orient='records')
            inner[0] = lvltemp            
    return(inner)

jsondf = cdf[cdf['Title']=='Level1'].copy()
jsondf['Employees'] = jsondict()    
jsondata = jsondf.to_json(orient='records')

Output

[{"Name":"Mike","Title":"Level1","Employees":
[{"Name":"Jon","Title":"Level2","Employees":
[{"Name":"Susan","Title":"Level3","Employees":
[{"Name":"Joe","Title":"Level4","Employees":
[{"Name":"Jimmy","Title":"Level5"}]},
{"Name":"Josh","Title":"Level4","Employees":
[[{"Name":"Jimmy","Title":"Level5"}]]}]}]}]}]

Or pretty printed

[
  {
    "Name": "Mike",
    "Title": "Level1",
    "Employees": [
      {
        "Name": "Jon",
        "Title": "Level2",
        "Employees": [
          {
            "Name": "Susan",
            "Title": "Level3",
            "Employees": [
              {
                "Name": "Joe",
                "Title": "Level4",
                "Employees": [
                  {
                    "Name": "Jimmy",
                    "Title": "Level5"
                  }
                ]
              },
              {
                "Name": "Josh",
                "Title": "Level4",
                "Employees": [
                  [
                    {
                      "Name": "Jimmy",
                      "Title": "Level5"
                    }
                  ]
                ]
              }
            ]
          }
        ]
      }
    ]
  }
]

This function will only return one employee per manager. It's a good starting point. It needs some tweaking - will give it a shot later — mrp, Aug 18 '16 at 21:51
No, I ran this on the sample dataset shown. The reason this is happening is because you're only take the first set dictionary records: .to_dict(...)[0]. You can also see in your output that Josh (Level 4) is missing — mrp, Aug 18 '16 at 23:28

Create Nested JSON from Pandas for Org Chart

1 Answers1

Linked