0

I'm trying to create a nested JSON object from a hierarchical DataFrame (python 3.5) to feed into JavaScript to render an Org Chart. I'm essentially trying to create the structure found in the answer of this question: Organization chart - tree, online, dynamic, collapsible, pictures - in D3

An example dataframe:

df = pd.DataFrame({\
'Manager_Name':['Mike' ,'Jon', 'Susan' ,'Susan' ,'Joe'],\
'Manager_Title':['Level1' ,'Level2'  ,'Level3' ,"Level3", 'Level4'],\
'Employee_Name':['Jon' ,'Susan' ,'Josh' ,'Joe' ,'Jimmy'],\
'Employee_Title':["Level2" ,"Level3" ,"Level4" ,"Level4" ,"Level5"]})

The desired output would be:

"Name": "Mike"
"Title": "Level1"
"Employees": [{
        "Name": "Jon"
        "Title": "Level2"
        "Employees": [{
               "Name": "Susan"
               "Title": "Level3"
               "Employees": [{
               ...
               ...
               ...
               }]
      }]
}]

I know this isn't a code generating service but I've tried applying other similarly related answers and can't seem to apply those answers here. I also haven't worked with dictionaries that much (I'm more of an R person) so there's probably some noobishness to this question. I've more time than I should on this yet I'm sure someone here can do this in a few minutes.

Other questions:

Thanks in advance!

Community
  • 1
  • 1
mrp
  • 689
  • 2
  • 11
  • 28

1 Answers1

0

Consider filtering out dataframe by Level and converting dfs to dictionary with pandas to_dict() which are continually rolled into one list across levels. Below defined function walks from last level to first to roll up individual Employee Levels dictionaries. But first you should concatenate Manager and Employee Name and Title columns.

import json
import pandas as pd

cdf = pd.concat([df[['Manager_Name', 'Manager_Title']].\
            rename(index=str, columns={'Manager_Name':'Name', 'Manager_Title':'Title'}),
            df[['Employee_Name', 'Employee_Title']].\
            rename(index=str, columns={'Employee_Name':'Name', 'Employee_Title':'Title'})])

cdf = cdf.drop_duplicates().reset_index(drop=True)
print(cdf)
#     Name   Title
# 0   Mike  Level1
# 1    Jon  Level2
# 2  Susan  Level3
# 3    Joe  Level4
# 4   Josh  Level4
# 5  Jimmy  Level5

def jsondict():
    inner = ['']
    for i in ['Level5', 'Level4', 'Level3', 'Level2']:            
        if i == 'Level5':
            inner[0] = cdf[cdf['Title']==i].to_dict(orient='records')            
        else:
            tmp = cdf[cdf['Title']==i].copy().reset_index(drop=True)            
            if len(tmp) == 1:
                tmp['Employees'] = [inner[0]]
            else:
                for d in range(0,len(tmp)):
                    tmp.ix[d, 'Employees'] = [inner[0]]                                
            lvltemp = tmp.to_dict(orient='records')
            inner[0] = lvltemp            
    return(inner)

jsondf = cdf[cdf['Title']=='Level1'].copy()
jsondf['Employees'] = jsondict()    
jsondata = jsondf.to_json(orient='records')

Output

[{"Name":"Mike","Title":"Level1","Employees":
[{"Name":"Jon","Title":"Level2","Employees":
[{"Name":"Susan","Title":"Level3","Employees":
[{"Name":"Joe","Title":"Level4","Employees":
[{"Name":"Jimmy","Title":"Level5"}]},
{"Name":"Josh","Title":"Level4","Employees":
[[{"Name":"Jimmy","Title":"Level5"}]]}]}]}]}]

Or pretty printed

[
  {
    "Name": "Mike",
    "Title": "Level1",
    "Employees": [
      {
        "Name": "Jon",
        "Title": "Level2",
        "Employees": [
          {
            "Name": "Susan",
            "Title": "Level3",
            "Employees": [
              {
                "Name": "Joe",
                "Title": "Level4",
                "Employees": [
                  {
                    "Name": "Jimmy",
                    "Title": "Level5"
                  }
                ]
              },
              {
                "Name": "Josh",
                "Title": "Level4",
                "Employees": [
                  [
                    {
                      "Name": "Jimmy",
                      "Title": "Level5"
                    }
                  ]
                ]
              }
            ]
          }
        ]
      }
    ]
  }
]
Parfait
  • 104,375
  • 17
  • 94
  • 125
  • This function will only return one employee per manager. It's a good starting point. It needs some tweaking - will give it a shot later – mrp Aug 18 '16 at 21:51
  • Did you run this on larger dataset? – Parfait Aug 18 '16 at 22:18
  • No, I ran this on the sample dataset shown. The reason this is happening is because you're only take the first set dictionary records: .to_dict(...)[0]. You can also see in your output that Josh (Level 4) is missing – mrp Aug 18 '16 at 23:28