Assume that I have a pandas DataFrame called df
that looks something like:
source tables columns
src1 table1 col1
src1 table1 col2
src1 table2 col1
src2 table1 col1
src2 table1 col2
My current code below can iterate through the list of sources and nest the list of tables within each source as an object:
data = [
{k: v}
for k, v in df.groupby('source')['tables'].agg(
lambda x: {v: {} for v in x}).items()
]
with open('data.json', 'w') as f:
json.dump(data, f, indent = 2)
The output I'm receiving with this code is as follows:
[
{
"src1": {
"table1": {},
"table2": {}
}
},
{
"src2": {
"table1": {},
}
}
]
My desired output:
[
{
"src1": {
"table1": {
"col1": {},
"col2": {}
},
"table2": {
"col1": {}
}
}
},
{
"src2": {
"table1": {
"col1": {}
}
}
}
]
Any assistance in converting my 2-layer nested JSON file to 3 layers as shown above would be greatly appreciated. Thank you in advance.