
I have a dataset that looks something like this:

col_1  col_2  col_3
    0    nan    nan
  nan      1    nan
  nan    nan    nan
  nan    nan      1

And I need to collapse those columns into something like this:

foo
0
1
nan
1
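
For reference, here's a minimal setup that reproduces the frame above (the column names and the columns list are assumptions based on the example):

import numpy as np
import pandas as pd

# Sample frame matching the example; values are floats with np.nan gaps.
df = pd.DataFrame({
    "col_1": [0, np.nan, np.nan, np.nan],
    "col_2": [np.nan, 1, np.nan, np.nan],
    "col_3": [np.nan, np.nan, np.nan, 1],
})
columns = ["col_1", "col_2", "col_3"]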

My first attempt was doing this:

df[columns].values[~df[columns].isna()]

But since there are rows where every value is NaN, those rows are dropped from the result.
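
For example, on the sample frame above this returns a flat array of just the three non-NaN values, so the all-NaN row vanishes and row alignment is lost:

df[columns].values[~df[columns].isna()]
# array([0., 1., 1.])  <- 3 values for 4 rows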

My second attempt was doing this:

def get_cols_or_nan(row):
    # Boolean mask of the non-NaN entries in this row.
    mask = ~row.isna()
    if np.any(mask):
        # Return the single non-NaN value by position.
        return row[mask].iloc[0]
    return float('nan')

df[columns].apply(get_cols_or_nan, axis=1)

But, for reasons I don't understand, this is significantly slower, to the point of being impractical for me.

Is there a more efficient way to collapse these columns? I am guaranteed that there is at most one non-NaN value in each row.

João Areias
  • what should happen if you also have a number in the first row for col2? – mozway May 11 '22 at 11:58
  • I hadn't seen the last sentence; in that case it's covered [here](https://stackoverflow.com/questions/31828240/first-non-null-value-per-row-from-a-list-of-pandas-columns) – mozway May 11 '22 at 12:03

2 Answers


If you have at most one non-NA value per row, you can use:

df.stack().droplevel(1).reindex_like(df)

output:

0    0.0
1    1.0
2    NaN
3    1.0
dtype: float64
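
To unpack why this works (a sketch using the sample frame from the question, relying on stack's classic behavior of dropping NaN by default):

s = df.stack()             # drops NaNs; index becomes (row, column) pairs
s = s.droplevel(1)         # discard the column level, keeping the row index
out = s.reindex_like(df)   # reinsert all-NaN rows, which come back as NaN

Note that the "at most one non-NA value per row" guarantee matters: a row with two non-NaN values would leave duplicate row labels after droplevel, and reindex_like would raise a duplicate-axis error.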
mozway

You can take the row-wise maximum with df.max(axis=1): NaN values are skipped, and since each row holds at most one non-NaN value, that maximum is exactly the value you want (all-NaN rows stay NaN).

out = df.max(axis=1).to_frame('foo')
print(out)

   foo
0  0.0
1  1.0
2  NaN
3  1.0
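
Since NaN-skipping is the default for pandas reductions, other reducers give the same result here; a couple of equivalent sketches:

out = df.min(axis=1).to_frame('foo')               # min works just as well as max
out = df.sum(axis=1, min_count=1).to_frame('foo')  # min_count=1 keeps all-NaN rows NaN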
Ynjxsjmh