I have a dataset that looks something like this:
col_1 | col_2 | col_3 |
---|---|---|
0 | nan | nan |
nan | 1 | nan |
nan | nan | nan |
nan | nan | 1 |
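For reference, here is a minimal DataFrame matching the table above (the column names are assumed for illustration):

```python
import numpy as np
import pandas as pd

# Minimal reproduction of the example table; column names are assumed.
df = pd.DataFrame({
    "col_1": [0, np.nan, np.nan, np.nan],
    "col_2": [np.nan, 1, np.nan, np.nan],
    "col_3": [np.nan, np.nan, np.nan, 1],
})
columns = ["col_1", "col_2", "col_3"]
```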
And I need to collapse those columns into something like this:
foo |
---|
0 |
1 |
nan |
1 |
My first attempt was doing this:
    df[columns].values[~df[columns].isna()]
But since some rows are all nan, the result drops those rows entirely and no longer lines up with the original index.
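Concretely, on a DataFrame shaped like the example (names assumed), the boolean mask keeps only the non-nan cells, so the flattened result has fewer entries than the frame has rows:

```python
import numpy as np
import pandas as pd

# Same assumed example frame as in the question.
df = pd.DataFrame({
    "col_1": [0, np.nan, np.nan, np.nan],
    "col_2": [np.nan, 1, np.nan, np.nan],
    "col_3": [np.nan, np.nan, np.nan, 1],
})
columns = ["col_1", "col_2", "col_3"]

# Boolean indexing flattens to the non-nan cells only,
# so the all-nan row simply vanishes from the output.
vals = df[columns].values[~df[columns].isna().values]
print(len(vals), len(df))  # 3 values for 4 rows
```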
My second attempt was doing this:
    def get_cols_or_nan(row):
        mask = ~row.isna()
        if np.any(mask):
            return row[mask].iloc[0]
        return float('nan')

    df[columns].apply(get_cols_or_nan, axis=1)
But, for reasons I don't understand, this is significantly slower, to the point of being impractical for me.
Is there a more efficient way to collapse those columns? I am guaranteed that each row has at most one non-nan value.