Grouping Pandas DataFrame with NaNs

Question

I have a DataFrame df1:

df1 = pd.DataFrame({
    "id": [1, 1, 2, 2, 3, 3],
    "text": ["a", "a", "b", "b", np.nan, np.nan],
    "value1": [2, np.nan, 6, np.nan, 7, np.nan],
    "value2": [np.nan, 8, np.nan, 1, np.nan, 9],
    "value3": [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan]
})

   id text  value1  value2  value3
0   1    a     2.0     NaN     NaN
1   1    a     NaN     8.0     NaN
2   2    b     6.0     NaN     NaN
3   2    b     NaN     1.0     NaN
4   3  NaN     7.0     NaN     NaN
5   3  NaN     NaN     9.0     NaN

My goal is to group df1 by id to get the following result:

   id text  value1  value2  value3
0   1    a       2       8     NaN
1   2    b       6       1     NaN
2   3  NaN       7       9     NaN

I tried the following code:

def f(x):
    y = x.dropna()
    return np.nan if y.empty else y

df1.groupby(["id"]).agg(f).reset_index()

but in the result the "text" column is missing because of the NaN.

   id  value1  value2  value3
0   1     2.0     8.0     NaN
1   2     6.0     1.0     NaN
2   3     7.0     9.0     NaN

How can I solve this?

Does `df1.groupby('id', as_index=False).first()` give you what you need? — Chris Adams, Jun 17 '21 at 08:39

score 1 · Answer 1 · answered Jun 17 '21 at 08:42

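Group by id, call first(), and reset the index. first() returns the first non-null value of each column within each group, so the two partial rows per id collapse into one.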

df1.groupby('id').first().reset_index()

   id  text  value1  value2  value3
0   1     a     2.0     8.0     NaN
1   2     b     6.0     1.0     NaN
2   3  None     7.0     9.0     NaN
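For reference, here is a self-contained sketch of the same call using the DataFrame from the question. It only assumes that pandas and numpy are installed. How the missing text value is displayed (None or NaN) may differ between pandas versions.

```python
import numpy as np
import pandas as pd

# DataFrame from the question
df1 = pd.DataFrame({
    "id": [1, 1, 2, 2, 3, 3],
    "text": ["a", "a", "b", "b", np.nan, np.nan],
    "value1": [2, np.nan, 6, np.nan, 7, np.nan],
    "value2": [np.nan, 8, np.nan, 1, np.nan, 9],
    "value3": [np.nan] * 6,
})

# first() picks the first non-null entry of each column within each group;
# a column with no non-null entry in a group stays missing for that group.
result = df1.groupby("id").first().reset_index()

print(result)
```

Because first() skips nulls column by column, the "text" column is kept even for id 3, where every value is missing.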

ThePyGuy

score 0 · Answer 2 · answered Jun 17 '21 at 08:39

Try:

out = df1.groupby('id', dropna=False, as_index=False).first()

Output of out:

   id  text  value1  value2  value3
0   1     a     2.0     8.0     NaN
1   2     b     6.0     1.0     NaN
2   3  None     7.0     9.0     NaN

Anurag Dabas

dropna is not supported inside the groupby – Bharat Kotwani Jun 17 '21 at 08:57
@BharatKotwani I don't think so, because it's not deprecated yet – Anurag Dabas Jun 17 '21 at 11:46
I have tried and tested it; that's why I am saying so. – Bharat Kotwani Jun 18 '21 at 19:06
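A note on the disagreement above: the dropna keyword of groupby was added in pandas 1.1.0, so it is possibly just a version difference, since older installations reject the argument. The keyword controls whether rows with a missing group key are dropped; it does not affect missing values in the other columns. The sketch below uses a small hypothetical frame (the names key and val are made up for illustration) to show the difference. In the question's data the id column has no NaNs, so dropna=False should not change the result there.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a missing group key (illustration only)
df = pd.DataFrame({
    "key": ["x", "x", np.nan, np.nan],
    "val": [np.nan, 1.0, 2.0, np.nan],
})

# Default (dropna=True): rows whose key is NaN are excluded from the result
default = df.groupby("key").first().reset_index()

# dropna=False (requires pandas >= 1.1.0): NaN keys form their own group
with_nan_key = df.groupby("key", dropna=False).first().reset_index()

print(default)
print(with_nan_key)
```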
