
I have a pandas dataframe:

 col1   col2   col3
  a     NaN    NaN
  b      1      2
  b      3      4
  c      5      6

I would like to change it to a dataframe like this:

 col1   col2    col3
  a     NaN     NaN
  b    [1,3]   [2,4]
  c      5       6

Is there a simple way to achieve this?

vaeinoe
  • Does this answer your question? [How to group dataframe rows into list in pandas groupby](https://stackoverflow.com/questions/22219004/how-to-group-dataframe-rows-into-list-in-pandas-groupby) – Tim Biegeleisen Jan 21 '21 at 10:12
  • Mostly, thank you! It makes everything a list, though, whereas in my example there are no one-element lists (but I didn't really think of that at the time of writing the question :D ) – vaeinoe Jan 21 '21 at 16:03

1 Answer


You need a custom lambda function that returns a list only when a group has more than one row:

df1 = df.groupby('col1').agg(lambda x: list(x) if len(x) > 1 else x).reset_index()
print(df1)
  col1        col2        col3
0    a         NaN         NaN
1    b  [1.0, 3.0]  [2.0, 4.0]
2    c         5.0         6.0
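
On some inputs the else x branch can leave a one-row Series in the cell instead of the bare value (a comment below reports this for NaN). A minimal sketch of one way to avoid that, assuming the frame from the question is built as shown here: return the scalar explicitly with .iloc[0]. The helper name list_if_many is made up for illustration.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (the construction itself is an assumption).
df = pd.DataFrame({
    "col1": ["a", "b", "b", "c"],
    "col2": [np.nan, 1, 3, 5],
    "col3": [np.nan, 2, 4, 6],
})

def list_if_many(s):
    """Return a list for multi-row groups, otherwise the single scalar value."""
    # s is the Series of one column's values within one group.
    return list(s) if len(s) > 1 else s.iloc[0]

df1 = df.groupby("col1").agg(list_if_many).reset_index()
print(df1)
```

Because the single-row branch now yields a plain scalar, the cells for a and c hold NaN and numbers rather than Series objects.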

because aggregating with list alone also produces one-element lists:

print(df.groupby('col1').agg(list))
            col2        col3
col1                        
a          [nan]       [nan]
b     [1.0, 3.0]  [2.0, 4.0]
c          [5.0]       [6.0]
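
An alternative, sketched here under the same assumed sample frame, is to aggregate everything with list first and then unwrap the one-element lists afterwards. The helper unwrap_single is a hypothetical name, not part of pandas.

```python
import numpy as np
import pandas as pd

# Sample frame rebuilt from the question (the construction itself is an assumption).
df = pd.DataFrame({
    "col1": ["a", "b", "b", "c"],
    "col2": [np.nan, 1, 3, 5],
    "col3": [np.nan, 2, 4, 6],
})

def unwrap_single(v):
    """Turn a one-element list into its only element; leave longer lists alone."""
    return v[0] if len(v) == 1 else v

# Step 1: every group becomes a list, including single-row groups.
lists = df.groupby("col1").agg(list)

# Step 2: unwrap the one-element lists column by column.
df2 = lists.apply(lambda col: col.map(unwrap_single)).reset_index()
print(df2)
```

This keeps the aggregation itself trivial and moves the special case into a separate, easily tested step.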
jezrael
  • Thank you for the quick solution! It's really weird but with this method, in some cases, what is originally a NaN value gets turned into "0 NaN Name: col2, dtype: object " which is a python object :) – vaeinoe Jan 21 '21 at 15:54