I am not sure how to word this properly, so I will try to explain this with a replicable example.
I have thousands of entries in a pandas.DataFrame
object. I want to export each row as its own json file with a few keys that are not explicitly available in the data frame's structure.
My data frame, df
, looks as follows:
> df = pd.DataFrame({'ID':['0','1','2'],
'NAME':['jimmy','james','joben'],
'LAST':['neutron','baxter','segel'],
'ADDRESS':['101 ocean avenue','202 bubble gum county','303 china town'],
'COUNTY':['yellow card park','candy kingdom','universal studio']})
ID | NAME | LAST | ADDRESS | COUNTY |
---|---|---|---|---|
0 | jimmy | neutron | 101 ocean avenue | yellow card park |
1 | james | baxter | 202 bubble gum county | candy kingdom |
2 | joben | segel | 303 china town | universal studio |
I want to transform each row to json files with the following structure:
- 'ID'
- 'PERSONAL', which encloses
NAME
andLAST
- 'GEO', which encloses
ADDRESS
andCOUNTY
In other words, I would like the first transformed row to look as follows:
{
'ID':'0',
'PERSONAL':
{
'NAME':'jimmy',
'LAST':'neutron'
},
'GEO':
{
'ADDRESS':'101 ocean avenue',
'COUNTY':'yellow card park'
}
}
I was thinking of converting df
into a pandas.MultiIndex
structure to add the PERSONAL
and GEO
indices and then export each row with pandas.DataFrame.to_json
. However, I'm not sure if this will work and wanted to ask around prior to trying this.