How to split a dict column from a pandas DataFrame

Question

Splitting dictionary/list inside a Pandas Column into Separate Columns

The question linked above solves part of my problem.

But I have the same problem with slightly different input. Here is my DF:

df = pd.DataFrame({'a':[1,2,3], 'b':[[{'c':1},{'c':3}], {'d':3}, {'c':5, 'd':6}]})

The values in column "b" can themselves be a list of dicts.

My expected output:

  [a    c   c1    d 
0  1   1.0  3    NaN  
1  2   NaN  NaN  3.0 
2  3   5.0  NaN  6.0][1]

Could you please help?

It would be helpful to know what you've tried already. Also, what's with the `[1]` at the end of your dataframe? – Paul H, Jan 16 '18 at 08:00
I used `DF1 = DF.loc[pd.notnull(DF.Modes)]["Modes"].apply(lambda x: pd.Series(eval(str(x))[0]))` to split the dictionary, but it only fetches the first dict. I want all the dicts in the list. – Rakesh Bhagam, Jan 16 '18 at 10:03

jezrael · Answer 1 · 2018-01-16T08:38:27.743

0

You can use:

# build one DataFrame per row, wrapping a single dict in a list first
L = [pd.DataFrame(x) if isinstance(x, list) else pd.DataFrame([x]) for x in df['b']]
# concatenate with the original index as keys, move the list position into columns, drop all-NaN columns
df1 = pd.concat(L, keys=df.index).unstack().dropna(how='all', axis=1)
# flatten the MultiIndex in the columns
df1.columns = ['{}{}'.format(a,b) for a,b in df1.columns]
print (df1)
    c0   c1   d0
0  1.0  3.0  NaN
1  NaN  NaN  3.0
2  5.0  NaN  6.0

# remove the original column b and join df1 (axis must be passed by keyword in recent pandas)
df = df.drop('b', axis=1).join(df1)
print (df)
   a   c0   c1   d0
0  1  1.0  3.0  NaN
1  2  NaN  NaN  3.0
2  3  5.0  NaN  6.0
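
For reuse, the same steps can be wrapped in a function. This is only a sketch of the approach above: the name `expand_dict_column` and the `sep` parameter are made up for illustration, and it assumes every cell in the column is either a dict or a list of dicts.

```python
import pandas as pd


def expand_dict_column(df, col, sep=""):
    """Hypothetical helper: expand dicts / lists of dicts in df[col] into columns."""
    # Wrap single dicts in a list so every cell becomes a small DataFrame
    frames = [pd.DataFrame(x if isinstance(x, list) else [x]) for x in df[col]]
    # Stack the frames under the original index, then move the list position into columns
    wide = pd.concat(frames, keys=df.index).unstack().dropna(how="all", axis=1)
    # Flatten the (key, position) MultiIndex into names such as "c0"
    wide.columns = [f"{key}{sep}{pos}" for key, pos in wide.columns]
    # Replace the nested column with the expanded ones
    return df.drop(col, axis=1).join(wide)


df = pd.DataFrame({'a': [1, 2, 3],
                   'b': [[{'c': 1}, {'c': 3}], {'d': 3}, {'c': 5, 'd': 6}]})
out = expand_dict_column(df, 'b')
print(out)
```

Passing `sep="_"` would give names like `c_0` instead of `c0`.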

edited Jan 16 '18 at 08:38

answered Jan 16 '18 at 08:00


Thanks Jezrael, this example is not working with my data. I'm reading the data from an xls file as a data frame and applying your code. – Rakesh Bhagam Jan 16 '18 at 09:52
Is the data confidential? – jezrael Jan 16 '18 at 09:56
No. How can I share my Excel file? – Rakesh Bhagam Jan 16 '18 at 09:57
Can you send your code + file to the email address in my profile? Thanks. – jezrael Jan 16 '18 at 09:59
I sent you an email, can you check it? – jezrael Jan 16 '18 at 11:36
I replied to your mail, jezrael. – Rakesh Bhagam Jan 23 '18 at 08:38
Please check answer. `df['Modes'] = df['Modes'].apply(pd.io.json.loads)` – jezrael Jan 23 '18 at 08:40
My data does not always come from an xls file; it is a normal DF. pd.io is only applicable to some input file. Here my input comes from a data frame in code. – Rakesh Bhagam Jan 23 '18 at 09:17
Hmm, the source of the data is not important, but if you need to convert a `json` or `dict` column, use `df['Modes'] = df['Modes'].apply(pd.io.json.loads)`. – jezrael Jan 23 '18 at 09:23
I'm facing the error below. File "C:\Python36\lib\site-packages\pandas\core\series.py", line 2510, in apply mapped = lib.map_infer(values, f, convert=convert_dtype) File "pandas/_libs/src\inference.pyx", line 1521, in pandas._libs.lib.map_infer ValueError: Expected object or value – Rakesh Bhagam Jan 23 '18 at 14:34
@RakeshBhagam - It means bad data, unfortunately :( You can check [this solution](https://stackoverflow.com/a/48396555/2901002) to find all bad rows, and instead of `ast.literal_eval(x)` use `pd.io.json.loads(x)` – jezrael Jan 23 '18 at 15:05
It returns the same DF. Providing the input again. Input: `df = pd.DataFrame({'a':[1,2], 'b':[[{'x1':1,'x2':3},{'x1':4,'x2':1}],[{'x1':5},{'x1':3,'x2':6}]], 'c':[5,6]})` Expected output: columns `x1_0 x2_0 x1_1 x2_1`, rows `1 3 4 1` and `5 NaN 3 6` – Rakesh Bhagam Jan 23 '18 at 16:02
I added a new answer for you. But in my opinion the problem is with your data. What is the source of your data? Excel, csv, json, some API? If the source is an Excel or csv file, the problem is that your lists and dictionaries are converted to strings. So the first step is converting the strings back to dictionaries and lists, and then my solutions can work. – jezrael Jan 24 '18 at 08:46
I understand now, with the new solution. My data is a string when read from .xls. Thanks Jezrael. – Rakesh Bhagam Jan 24 '18 at 12:08
Yes, so how does the solution work? Or how are you converting the strings to dictionaries? – jezrael Jan 24 '18 at 12:10
I'm using `df[Column].apply(ast.literal_eval)` to parse the strings, and the rest of the procedure is the same. – Rakesh Bhagam Jan 24 '18 at 13:49
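
To summarize where the comments ended up, here is a rough sketch of the whole flow on the second input from the thread. It assumes the nested column arrived as text (simulated here with hand-written strings, as might happen after a round trip through Excel or CSV), and the `_` separator and column ordering are chosen only to match the expected output posted above.

```python
import ast

import pandas as pd

# Assumed input: column 'b' holds the string form of the lists, not real lists
df = pd.DataFrame({
    'a': [1, 2],
    'b': ["[{'x1': 1, 'x2': 3}, {'x1': 4, 'x2': 1}]",
          "[{'x1': 5}, {'x1': 3, 'x2': 6}]"],
    'c': [5, 6],
})

# Parse each string back into Python lists/dicts; literal_eval accepts only literals
df['b'] = df['b'].apply(ast.literal_eval)

# Same steps as in the answer: one small DataFrame per row, then concat + unstack
frames = [pd.DataFrame(x if isinstance(x, list) else [x]) for x in df['b']]
wide = pd.concat(frames, keys=df.index).unstack().dropna(how='all', axis=1)

# Put the list position first and sort, so the order is x1_0, x2_0, x1_1, x2_1
wide.columns = wide.columns.swaplevel(0, 1)
wide = wide.sort_index(axis=1)
wide.columns = [f"{key}_{pos}" for pos, key in wide.columns]

out = df.drop('b', axis=1).join(wide)
print(out)
```

If the column already holds real lists and dicts, the `ast.literal_eval` step has to be skipped, since it expects strings.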
