I have data frame as like df
:
id features
100 [{'city': 'Rio'}, {'destination': '2'}]
110 [{'city': 'Sao Paulo'}]
135 [{'city': 'Recife'}, {'destination': '45'}]
145 [{'city': 'Munich'}, {'destination': '67'}]
167 [{'city': 'Berlin'}, {'latitude':'56'}, {'longitude':'30'}]
I have to extract column name and values from features
column to separate columns as like:
id city destination latitude longitude
100 'Rio' '2' NaN NaN
110 'Sao Paulo' NaN NaN NaN
135 'Recife' '45' NaN NaN
145 'Munich' '67' NaN NaN
167 'Berlin' NaN '56' '30'
I tried to do it with usage idea as like:
1st method to extract:
df = df.explode('features').reset_index(drop = True)
result = pd.concat([df.drop(columns='features'),
pd.json_normalize(df['features'])], axis=1)
result is only id
column.
2nd method:
df = df.explode('features').reset_index(drop = True)
df2 = df.set_index('id')
df2 = df2['features'].astype('str')
df2 = df2.apply(lambda x: ast.literal_eval(x))
df2 = df2.apply(pd.Series)
result = df2.reset_index()
result
is very closed what I need:
id city destination latitude longitude
100 'Rio' NaN NaN NaN
100 NaN '2' NaN NaN
110 'Sao Paulo' NaN NaN NaN
135 'Recife' NaN NaN NaN
135 NaN '45' NaN NaN
145 'Munich' NaN NaN NaN
145 'Munich' '67' NaN NaN
167 'Berlin' NaN NaN NaN
167 NaN NaN '56' NaN
167 NaN NaN NaN '30'
How is possible to achieve an expected result in view of:
id city destination latitude longitude
100 'Rio' '2' NaN NaN
110 'Sao Paulo' NaN NaN NaN
135 'Recife' '45' NaN NaN
145 'Munich' '67' NaN NaN
167 'Berlin' NaN '56' '30'
Thanks