
I have the following dataframe, where each cell holds a dict:

                item1                     item2                        item3
777    {'value1':x, 'value2':a}    {'value1':y, 'value2':a}    {'value1':z, 'value2':c}
778    {'value1':x, 'value2':b}    {'value1':z, 'value2':c}    { }
779    {'value1':y, 'value2':a}    {'value1':z, 'value2':d}    {'value1':w, 'value2':b}
...

How can I form the following dataframe?

       item1    value2    item2    value2    item3    value2
777      x        a         y        a         z        c
778      x        b         z        c        none     none
779      y        a         z        d         w        b

The main dataframe is:

import pandas as pd

# Each column holds one dict per row
df = pd.DataFrame({
    'item1': [{'value1': 'x', 'value2': 'a'}, {'value1': 'x', 'value2': 'b'}, {'value1': 'y', 'value2': 'a'}],
    'item2': [{'value1': 'y', 'value2': 'a'}, {'value1': 'z', 'value2': 'c'}, {'value1': 'z', 'value2': 'd'}],
    'item3': [{'value1': 'z', 'value2': 'c'}, {'value1': 'none', 'value2': 'none'}, {'value1': 'w', 'value2': 'b'}],
})

I tried .apply(pd.Series), but I don't see how to get this result with it. Any hints would be appreciated. Thanks!

DjaouadNM
  • Possible duplicate of [Pandas split column of (unequal length) list into multiple columns](https://stackoverflow.com/questions/55228901/pandas-split-column-of-unequal-length-list-into-multiple-columns) – Mike Aug 26 '19 at 23:14

2 Answers


You can get exactly that output by building one small DataFrame per column from a dict, using str.get() to pull out each key, and concatenating the results:

pd.concat([pd.DataFrame({col: df[col].str.get('value1'),
                         'value2': df[col].str.get('value2')})
           for col in df.columns],
          axis=1)

    item1 value2 item2 value2 item3 value2
777     x      a     y      a     z      c
778     x      b     z      c  None   None
779     y      a     z      d     w      b

Note that having columns with duplicate names is not recommended. To avoid the ambiguity, you can change the 'value2' key in the dict to f'{col}-value2'.
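For reference, here is a minimal sketch of that variant with unique column names. The index values (777 to 779) and the empty dict in item3 are assumptions taken from the table shown in the question, not from its construction code.

```python
import pandas as pd

# Assumed sample data: index and the empty dict follow the question's table
df = pd.DataFrame({
    'item1': [{'value1': 'x', 'value2': 'a'}, {'value1': 'x', 'value2': 'b'}, {'value1': 'y', 'value2': 'a'}],
    'item2': [{'value1': 'y', 'value2': 'a'}, {'value1': 'z', 'value2': 'c'}, {'value1': 'z', 'value2': 'd'}],
    'item3': [{'value1': 'z', 'value2': 'c'}, {}, {'value1': 'w', 'value2': 'b'}],
}, index=[777, 778, 779])

# One two-column frame per original column; the second key is made
# unique by prefixing it with the original column name
result = pd.concat(
    [pd.DataFrame({col: df[col].str.get('value1'),
                   f'{col}-value2': df[col].str.get('value2')})
     for col in df.columns],
    axis=1,
)
print(result)
```

With unique names, a selection such as result['item2-value2'] returns a single Series instead of every column called 'value2'.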

rafaelc

Try constructing a new dataframe by stacking the columns of dicts, expanding the dicts, and unstacking again:

pd.DataFrame(df.stack().tolist(), index=df.stack().index).unstack().sort_index(level=1, axis=1)

Out[480]:
    value1 value2 value1 value2 value1 value2
     item1  item1  item2  item2  item3  item3
777      x      a      y      a      z      c
778      x      b      z      c   none   none
779      y      a      z      d      w      b
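If flat, unique column names are preferred over the two-level header, the same stack/unstack approach can be followed by a renaming step. This is only a sketch: the index values are assumed from the question's table, and the 'item-key' naming scheme is an illustrative choice, not part of the original answer.

```python
import pandas as pd

# Data as constructed in the question; the index is an assumption
# taken from the table the question displays
df = pd.DataFrame({
    'item1': [{'value1': 'x', 'value2': 'a'}, {'value1': 'x', 'value2': 'b'}, {'value1': 'y', 'value2': 'a'}],
    'item2': [{'value1': 'y', 'value2': 'a'}, {'value1': 'z', 'value2': 'c'}, {'value1': 'z', 'value2': 'd'}],
    'item3': [{'value1': 'z', 'value2': 'c'}, {'value1': 'none', 'value2': 'none'}, {'value1': 'w', 'value2': 'b'}],
}, index=[777, 778, 779])

# Stack once and reuse the result for both the values and the index
stacked = df.stack()
wide = (pd.DataFrame(stacked.tolist(), index=stacked.index)
          .unstack()
          .sort_index(level=1, axis=1))

# Columns are (key, item) tuples; join them into single strings
wide.columns = [f'{item}-{key}' for key, item in wide.columns]
print(wide)
```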
Andy L.