
Given below is my dataframe:

import pandas as pd
df = pd.DataFrame({'Col1':['1','2'],'Col2':[{'a':['a1','a2']},{'b':['b1']}]})

    Col1    Col2
0   1       {u'a': [u'a1', u'a2']}
1   2       {u'b': [u'b1']}

I need to reformat this dataframe as shown below:

  Col1 NCol2 NCol3
0    1     a    a1
1    1     a    a2
2    2     b    b1

Basically, for each key-value pair in the dictionary, I want to add one row per element of the value list, with the key in NCol2 and the element in NCol3.

Thanks in advance for your help.

Jithin P James
  • How did you get this dataframe in that format in the first place? It might make more sense to fix the process of creating it rather than trying to transform it after the fact. – Dan Oct 22 '19 at 09:36
  • I would have preferred it that way too, but this is how I got my dataset. – Jithin P James Oct 22 '19 at 09:44
  • Please refer to https://stackoverflow.com/questions/27263805/pandas-column-of-lists-create-a-row-for-each-list-element/48532692#48532692. It looks like a nice solution. – Jithin P James Oct 23 '19 at 06:54

1 Answer


You can use the following solution (it relies on `explode`, which requires pandas 0.25.0 or later):

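# Expand each dict into one column per key, then explode the lists in each column
df1 = df['Col2'].apply(pd.Series).apply(lambda x: x.explode())\
.stack().reset_index(level=1)

# After stack() and reset_index(), the key and the list element are the two columns
df1.columns = ['Col2', 'Col3']

# Join back to Col1 on the original row index
df.drop('Col2', axis=1).merge(df1, left_index=True, right_index=True)\
.reset_index(drop=True)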

Output:

  Col1 Col2 Col3
0    1    a   a1
1    1    a   a2
2    2    b   b1
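
If `explode` is not available in your pandas version, the same reshaping can be sketched with a plain list comprehension. This is only a minimal sketch, not a replacement for the solution above: the names `rows` and `result` are illustrative, and it assumes every value in `Col2` is a dict whose values are lists, as in the sample data.

```python
import pandas as pd

df = pd.DataFrame({'Col1': ['1', '2'],
                   'Col2': [{'a': ['a1', 'a2']}, {'b': ['b1']}]})

# Build one (Col1, key, element) tuple per list element.
# Assumption: each Col2 entry is a dict mapping keys to lists.
rows = [
    (col1, key, item)
    for col1, mapping in zip(df['Col1'], df['Col2'])
    for key, items in mapping.items()
    for item in items
]

result = pd.DataFrame(rows, columns=['Col1', 'NCol2', 'NCol3'])
print(result)
```

This also handles dicts with more than one key, since it loops over every key-value pair.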
Mykola Zotko
  • `AttributeError: ("'Series' object has no attribute 'explode'", 'occurred at index a')` – Dan Oct 22 '19 at 09:58
  • @Dan You need to update your pandas. `explode` is new in pandas 0.25.0. – Mykola Zotko Oct 22 '19 at 09:59
  • I found another solution that does not use explode. It is from an answer to the question https://stackoverflow.com/questions/27263805/pandas-column-of-lists-create-a-row-for-each-list-element/48532692#48532692
    ```
    lst_col = 'attr_list'
    df_new = pd.DataFrame({
        col: np.repeat(df[col].values, df[lst_col].str.len())
        for col in df.columns.drop(lst_col)}
    ).assign(**{lst_col: np.concatenate(df[lst_col].values)})[df.columns]
    ```
    – Jithin P James Oct 23 '19 at 06:51
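
The snippet in the last comment is written for a column of lists (`attr_list`), not a column of dicts. A rough adaptation of the same `np.repeat` idea to the dataframe in the question could look like this. It is a sketch under one loud assumption: every dict in `Col2` holds exactly one key, as in the sample data. The names `keys`, `lists` and `lengths` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'Col1': ['1', '2'],
                   'Col2': [{'a': ['a1', 'a2']}, {'b': ['b1']}]})

# Assumption: each dict in Col2 has exactly one key (true for the sample data).
keys = [next(iter(d)) for d in df['Col2']]
lists = [d[k] for d, k in zip(df['Col2'], keys)]
lengths = [len(items) for items in lists]

df_new = pd.DataFrame({
    # Repeat Col1 and the key once per element of the corresponding list
    'Col1': np.repeat(df['Col1'].values, lengths),
    'NCol2': np.repeat(keys, lengths),
    # Flatten all lists into a single column
    'NCol3': np.concatenate(lists),
})
print(df_new)
```

If a dict can contain several keys, this version would silently ignore all but the first one, so it would need a different approach.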