0

From a json file, I know you can use json_normalize to convert said json file into a pandas Dataframe. When there are nested dictionaries, json_normalize does flatten them. However, for my json file, the nested dictionaries are within a list, see Var2 below. How would I access the elements within Var2 and convert them to be there own dataframe?

from pandas.io.json import json_normalize
data = [
        {"Id":1,"Var1":2,"Var2":[{"Var2.1":4,"Var2.2":3}],"Var3":3},
        {"Id":2,"Var1":2,"Var2":[{"Var2.1":4}],"Var3":3},
        {"Id":3,"Var1":2,"Var2":[{"Var2.1":4,"Var2.2":3,"Var2.3":2}],"Var3":3}
        ]
df = json_normalize(data)

I have tried df["Var2"].apply(json_normalize), but that resulted in a really weird output.

How have others solve this problem?

MrBean Bremen
  • 14,916
  • 3
  • 26
  • 46
Jack Armstrong
  • 1,182
  • 4
  • 26
  • 59

1 Answers1

3
from pandas.io.json import json_normalize
df = json_normalize(data)
df['Var2'] = df['Var2'].apply(pd.Series)

result = df['Var2'].apply(pd.Series)

df:

Id  Var1    Var2                                    Var3
0   1   2   {'Var2.1': 4, 'Var2.2': 3}              3
1   2   2   {'Var2.1': 4}                           3 
2   3   2   {'Var2.1': 4, 'Var2.2': 3, 'Var2.3': 2} 3

result:

    Var2.1  Var2.2  Var2.3
0   4.0     3.0     NaN
1   4.0     NaN     NaN
2   4.0     3.0     2.0
Pygirl
  • 12,969
  • 5
  • 30
  • 43