I have a pandas dataframe (sample) as follows
df = pd.DataFrame({'Country':['India', 'China', 'Nepal'],
'Habitat':[[{'city1':'Ind1','city2':'Ind2'},{'town1':'IndT1','town2':'IndT2'}],
[{'city1':'Chi1','city2':'Chi2'},{'town1':'ChiT1','town2':'ChiT2'}],
[{'city1':'Nep1','city2':'Nep2'},{'town1':'NepT1','town2':'NepT2'}]],
'num':[1,2,3]
})
df
Country Habitat num
0 India [{'city1':'Ind1','city2':'Ind2'},{'town1':'IndT1','town2':'IndT2'}] 1
1 China [{'city1':'Chi1','city2':'Chi2'},{'town1':'ChiT1','town2':'ChiT2'}] 2
2 Nepal [{'city1':'Nep1','city2':'Nep2'},{'town1':'NepT1','town2':'NepT2'}] 3
I need to flatten this out in this format.
result_df = pd.DataFrame({'Country':['India', 'China', 'Nepal'],
'Habitat.city1':['Ind1','Chi1','Nep1'],
'Habitat.city2':['Ind2','Chi2','Nep2'],
'Habitat.town1':['IndT1','ChiT1','NepT1'],
'Habitat.town2':['IndT2','ChiT2','NepT2'],
'num':[1,2,3]
})
result_df
Country Habitat.city1 Habitat.city2 Habitat.town1 Habitat.town2 num
India Ind1 Ind2 IndT1 IndT2 1
China Chi1 Chi2 ChiT1 ChiT2 2
Nepal Nep1 Nep2 NepT1 NepT2 3
I have tried pd.json_normalize(df.explode('Habitat')['Habitat])
but it creates new rows which I do not need.
My observation:
Some form of groupby and transpose that can properly build on pd.json_normalize(df.explode('Habitat')['Habitat])
could solve my problem but so far I have not had any luck