I am looking to do similar task as in this thread: Merge multiple column values into one column in python pandas
But the difference is that I would like to create a new column that merges all the non-null values in a column after grouping by another column(s). Here is a toy example:
df= pd.DataFrame({'ID1' : [1,1,2,2,3,3,3],'ID2' : ['a','a','b','b','c','c','c'],
'Status' : pd.Series([np.nan,'1', np.nan,'1','2',np.nan,'1'],
dtype="category")})
df
Out[74]:
ID1 ID2 Status
0 1 a NaN
1 1 a 1
2 2 b NaN
3 2 b 1
4 3 c 2
5 3 c NaN
6 3 c 1
I then want to groupby
ID1
and ID2
:
gr = df.groupby(['ID1','ID2'])
Then, I would like my result look like this:
Out:
NewCol
0 1
1 1
2 2,1
So it is a new DataFrame
that contains the non-null
values of Status
column grouped by ID1
and ID2
.
Thanks in advance.