I have a pandas DataFrame:

import pandas as pd

df = pd.DataFrame({'user': [1, 2, 2, 3], 'date': ['2023-04-12', '2023-04-13', '2023-04-15', '2023-04-18'],
                   'variable': ['x1', 'x1', 'x2', 'x1'], 'sth': ['xx', 'yy', 'yy', 'zz']})

   user        date variable sth
0     1  2023-04-12       x1  xx
1     2  2023-04-13       x1  yy
2     2  2023-04-15       x2  yy
3     3  2023-04-18       x1  zz

and would like to reshape it so that each value of variable becomes its own column holding the corresponding date, giving this DataFrame:

   user sth          x1          x2
0     1  xx  2023-04-12         NaN
1     2  yy  2023-04-13  2023-04-15
2     3  zz  2023-04-18         NaN

How can I do this?

corianne1234
  • `df.pivot(index=['user', 'sth'], columns='variable', values='date').reset_index()` – mozway Apr 27 '23 at 08:36
  • This does not work: ValueError: Length of passed values is 4, index implies 2 – corianne1234 Apr 27 '23 at 08:43
  • This certainly works with the provided example, please update your question if needed. – mozway Apr 27 '23 at 08:45
  • I copy-pasted exactly what I wrote above and I still get the error above. Does it work for you? – corianne1234 Apr 27 '23 at 08:47
  • pivot does not accept a list of columns as index, so you need to use pivot_table: `pd.pivot_table(df, index=['user', 'sth'], columns='variable', values='date', aggfunc='first').reset_index()` – corianne1234 Apr 27 '23 at 08:49
  • It does. Which pandas version are you using (`print(pd.__version__)`)? Likely an old one. – mozway Apr 27 '23 at 08:54
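The two approaches from the comments can be put side by side. This is a minimal sketch on the example data from the question. It assumes a reasonably recent pandas: as far as I know, a list passed as `index` to `DataFrame.pivot` is only accepted from version 1.1.0 onward, which would explain the error on an older install. The names `wide_pivot` and `wide_table` are made up for illustration, and the `rename_axis(columns=None)` call is an optional extra that only drops the leftover `variable` label from the column axis.

```python
import pandas as pd

df = pd.DataFrame({
    'user': [1, 2, 2, 3],
    'date': ['2023-04-12', '2023-04-13', '2023-04-15', '2023-04-18'],
    'variable': ['x1', 'x1', 'x2', 'x1'],
    'sth': ['xx', 'yy', 'yy', 'zz'],
})

# Check the installed version first; older releases reject a list as `index`
# in DataFrame.pivot (assumption: support arrived in pandas 1.1.0).
print(pd.__version__)

# Option 1: pivot with a list of index columns (newer pandas).
# Raises ValueError if a (user, sth, variable) combination occurs twice.
wide_pivot = (
    df.pivot(index=['user', 'sth'], columns='variable', values='date')
      .reset_index()
      .rename_axis(columns=None)  # drop the 'variable' label on the columns
)

# Option 2: pivot_table, which also runs on older pandas.
# aggfunc='first' keeps the first date if a combination is duplicated.
wide_table = (
    pd.pivot_table(df, index=['user', 'sth'], columns='variable',
                   values='date', aggfunc='first')
      .reset_index()
      .rename_axis(columns=None)
)

print(wide_pivot)
print(wide_table)
```

On this data both variants should give the same wide frame. They differ on duplicates: `pivot` refuses to guess, while `pivot_table` silently aggregates, so `pivot` is the safer choice when each combination is expected to be unique.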

0 Answers