
I'm trying to combine two DataFrames that have the same columns and the same number of rows, but one of them has NaN in rows where the other does not.

This example uses 2 DataFrames, but I have to do this with around 50 and merge them all into one.

DF1:

   id   b    c
0  1   15    1
1  2   nan  nan
2  3   2     3
3  4   nan  nan

DF2:

   id   b    c
0  1   nan  nan
1  2   26    6
2  3   nan  nan
3  4   60    3

Desired output:

   id   b    c
0  1   15    1
1  2   26    6
2  3   2     3
3  4   60    3
  • `df1.combine_first(df2)` – sushanth Aug 29 '20 at 18:15
  • or `df1.fillna(df2)` – anon01 Aug 29 '20 at 18:23
  • Thank you @sushanth and anon01, both answers did the trick. I'm trying to understand why both functions worked for this. – Pedro Pablo Severin Honorato Aug 29 '20 at 18:54
  • This previous question discusses the difference: https://stackoverflow.com/questions/46676134/what-is-the-difference-between-combine-first-and-fillna. Basically, `fillna` only fills existing `na` col/row values, whereas `combine_first` fills missing values and also adds cols/rows that don't exist in df1 – anon01 Aug 29 '20 at 19:02
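To make the difference described in the comments concrete, here is a minimal sketch using the frames from the question. The `extra` frame and the `reduce` step for many frames are illustrative assumptions, not part of the original comments, and the sketch assumes all frames share the same row index.

```python
from functools import reduce

import numpy as np
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3, 4],
                    "b": [15, np.nan, 2, np.nan],
                    "c": [1, np.nan, 3, np.nan]})
df2 = pd.DataFrame({"id": [1, 2, 3, 4],
                    "b": [np.nan, 26, np.nan, 60],
                    "c": [np.nan, 6, np.nan, 3]})

# Same index and columns: both calls fill df1's NaN cells from df2.
filled = df1.fillna(df2)
combined = df1.combine_first(df2)

# Hypothetical frame with a row label (4) that df1 does not have.
extra = pd.DataFrame({"id": [5], "b": [7.0], "c": [8.0]}, index=[4])
fillna_rows = len(df1.fillna(extra))          # fillna keeps df1's rows only
combine_rows = len(df1.combine_first(extra))  # combine_first adds the new row

# For many frames (e.g. a list of 50), fold combine_first over the list.
frames = [df1, df2]
merged = reduce(lambda left, right: left.combine_first(right), frames)
print(merged)
```

Both methods align on index labels, not on the `id` column. If the row order can differ between frames, calling `set_index("id")` on each frame first is probably the safer choice.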

1 Answer


If you have

import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.nan, index=[0, 1], columns=[0, 1])
df2 = pd.DataFrame([[0, np.nan]] * 2, index=[0, 1], columns=[0, 1])
df3 = pd.DataFrame([[np.nan, 1]] * 2, index=[0, 1], columns=[0, 1])

Then you can update df1 in place with each of the other frames:

for df in [df2, df3]:
    df1.update(df)

print(df1)

     0    1
0  0.0  1.0
1  0.0  1.0
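As a sketch of how the same loop might look on the frames from the question: the version below aligns rows on `id` instead of on position and works on a copy, because `update` modifies the frame in place and also overwrites non-NaN values wherever the other frame has one. The name `result` and the `set_index("id")` step are assumptions added for illustration.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({"id": [1, 2, 3, 4],
                    "b": [15, np.nan, 2, np.nan],
                    "c": [1, np.nan, 3, np.nan]})
df2 = pd.DataFrame({"id": [1, 2, 3, 4],
                    "b": [np.nan, 26, np.nan, 60],
                    "c": [np.nan, 6, np.nan, 3]})

# set_index returns a new frame, so df1 itself is left untouched.
result = df1.set_index("id")

# With ~50 frames, list them all here; only non-NaN cells are copied over.
for df in [df2]:
    result.update(df.set_index("id"))

result = result.reset_index()
print(result)
```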
RichieV