I am working on a similar visualization project that requires moving data from a Pandas DataFrame to a JSON file that works with D3.
I came across your post while looking for a solution and ended up writing something based on this GitHub repository, with input from the link you provided in your post.
The code is not pretty; it is a bit hacky and slow. In my project, though, it has worked fine for any amount of data, as long as the data has three levels and a value field. You should be able to fork the D3 Sunburst notebook and replace the flare.json file with this code's output.
The modification I made, based on the original GitHub post, is to handle three levels of data. If the level 0 node name already exists, only the level 1 node and everything below it is appended. Likewise, if the level 1 node name also exists, only the level 2 node (the third level) is appended. Otherwise, the full path of data is appended. If you need more levels, some kind of recursion might do the trick, or you can just keep hacking it to add more levels.
# code snippet to format a pandas DataFrame as JSON for a D3 sunburst chart
# libraries
import json
import pandas as pd
# example data with three levels and a single value field
data = {'group1': ['Animal', 'Animal', 'Animal', 'Plant'],
'group2': ['Mammal', 'Mammal', 'Fish', 'Tree'],
'group3': ['Fox', 'Lion', 'Cod', 'Oak'],
'value': [35000, 25000, 15000, 1500]}
df = pd.DataFrame.from_dict(data)
print(df)
""" The sample dataframe
   group1  group2 group3  value
0  Animal  Mammal    Fox  35000
1  Animal  Mammal   Lion  25000
2  Animal    Fish    Cod  15000
3   Plant    Tree    Oak   1500
"""
# initialize a flare dictionary
flare = {"name": "flare", "children": []}
# iterate through dataframe values
for row in df.values:
    level0 = row[0]
    level1 = row[1]
    level2 = row[2]
    value = row[3]

    # collect the names of the existing first-level nodes
    key0 = [node['name'] for node in flare['children']]

    # collect the second-level names under the matching first-level node;
    # looking them up by name keeps this correct even if the rows are unsorted
    key1 = []
    if level0 in key0:
        parent = flare['children'][key0.index(level0)]
        key1 = [child['name'] for child in parent['children']]

    if level0 not in key0:
        # add the full row of data if the root is not in key0
        d = {'name': level0,
             'children': [{'name': level1,
                           'children': [{'name': level2,
                                         'value': value}]}]}
        flare['children'].append(d)
    elif level1 not in key1:
        # the root exists, so append only the second-level node and its child
        d = {'name': level1,
             'children': [{'name': level2,
                           'value': value}]}
        parent['children'].append(d)
    else:
        # both upper levels exist, so append only the leaf
        d = {'name': level2,
             'value': value}
        parent['children'][key1.index(level1)]['children'].append(d)

# uncomment the next two lines to save the result as a JSON file
# with open('filename_here.json', 'w') as outfile:
#     json.dump(flare, outfile)
print(json.dumps(flare, indent=2))
""" the expected output of this json data
{
  "name": "flare",
  "children": [
    {
      "name": "Animal",
      "children": [
        {
          "name": "Mammal",
          "children": [
            {
              "name": "Fox",
              "value": 35000
            },
            {
              "name": "Lion",
              "value": 25000
            }
          ]
        },
        {
          "name": "Fish",
          "children": [
            {
              "name": "Cod",
              "value": 15000
            }
          ]
        }
      ]
    },
    {
      "name": "Plant",
      "children": [
        {
          "name": "Tree",
          "children": [
            {
              "name": "Oak",
              "value": 1500
            }
          ]
        }
      ]
    }
  ]
}
"""