This is related to: exploding a pandas dataframe column
Here's my dataframe:
import pandas as pd
import numpy as np
d = {'id': [1, 1, 1, 2, 2, 2], 'data': [{'foo':True}, {'foo':False, 'bar':True}, {'foo':True, 'bar':False, 'baz':True}, {'foo':False}, {'foo':False, 'bar':False}, {'foo':False, 'bar':True, 'baz':False}]}
df = pd.DataFrame(data=d)
df
I'd like to create a new column for each key that appears in the dictionaries in column data, holding the relevant True and False values (and np.nan wherever a row's dictionary lacks that key).
My new dataframe would look like:
a = {'id': [1, 1, 1, 2, 2, 2], 'data': [{'foo':True}, {'foo':False, 'bar':True}, {'foo':True, 'bar':False, 'baz':True}, {'foo':False}, {'foo':False, 'bar':False}, {'foo':False, 'bar':True, 'baz':False}], 'foo':[True, False, True, False, False, False], 'bar':[np.nan, True, False, np.nan, False, True], 'baz':[np.nan, np.nan, True, np.nan, np.nan, False] }
df1 = pd.DataFrame(data=a)
df1
I'm not sure if this can be achieved with Series.str.get_dummies, as I'm not sure how to map the True and False values. Appreciate any help!
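One possible approach, offered as a minimal sketch rather than the definitive answer: since each cell in data is already a dict, the list of dicts can be passed to the pd.DataFrame constructor, which creates one column per key and fills missing keys with NaN. The name expanded is just an illustrative variable; the rest reuses the sample data from the question.

```python
import pandas as pd

# Sample data from the question
d = {'id': [1, 1, 1, 2, 2, 2],
     'data': [{'foo': True},
              {'foo': False, 'bar': True},
              {'foo': True, 'bar': False, 'baz': True},
              {'foo': False},
              {'foo': False, 'bar': False},
              {'foo': False, 'bar': True, 'baz': False}]}
df = pd.DataFrame(data=d)

# Build one column per dict key; keys absent from a row's dict become NaN.
# Reusing df.index keeps the new columns aligned with the original rows.
expanded = pd.DataFrame(df['data'].tolist(), index=df.index)

# Attach the new columns alongside the original 'id' and 'data' columns
df1 = df.join(expanded)
print(df1)
```

Note that columns containing NaN (bar and baz here) end up with object dtype rather than bool, because NaN cannot be stored in a plain boolean column. This route avoids Series.str.get_dummies, which works on delimited strings rather than dicts.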