-2

Given the following DataFrame:

import numpy as np
import pandas as pd

df = pd.DataFrame({"ColA": [1, 2, np.nan, 4, np.nan],
                   "ColB": [np.nan, 7, np.nan, 9, 10]})

   ColA  ColB
0   1.0   NaN
1   2.0   7.0
2   NaN   NaN
3   4.0   9.0
4   NaN  10.0

I want to replace the values of ColA with those of ColB, but I never want a NaN value to replace a non-NaN value. My desired output is the following DataFrame:

   ColA  ColB
0   1.0   NaN
1   7.0   7.0
2   NaN   NaN
3   9.0   9.0
4   10.0  10.0

My actual data is quite large, so while I know that I can iterate row-by-row I am looking for a more efficient approach.

Clade
  • 966
  • 1
  • 6
  • 14
  • 1
    this could help: https://stackoverflow.com/questions/38152389/coalesce-values-from-2-columns-into-a-single-column-in-a-pandas-dataframe – Grzegorz Skibinski Aug 27 '19 at 13:52
  • Can't see how a simple `fillna` helps here (in regards to the dupe) – yatu Aug 27 '19 at 13:56
  • 1
    The question is not a duplicate of https://stackoverflow.com/questions/30357276/how-to-pass-another-entire-column-as-argument-to-pandas-fillna. I am not looking merely to fill NaN values, but to replace all values in one column with the values in another, unless the values in the other column are NaN. – Clade Aug 27 '19 at 14:02
  • 1
    Use `numpy.where()`. I cannot place my answer since it has been tagged duplicate hehe. – Joe Aug 27 '19 at 14:05
  • 1
    Yes, its a duplicate. It has already been answered. You could also go with `df['ColA'] = df.ColB.combine_first(df.ColA)` – yatu Aug 27 '19 at 14:13

1 Answers1

1
df['ColA']= np.where(df['ColA'].notna(), np.where(df['ColB'].isna(),df['ColA'],df['ColB']), df['ColA'])
moys
  • 7,747
  • 2
  • 11
  • 42
  • Thank you! This slight modification did the trick: df['ColA']= np.where(df['ColA'].notna(), np.where(df['ColB'].isna(), df['ColA'], df['ColB']), df['ColB']) – Clade Aug 27 '19 at 16:08