I have two DataFrames that both have missing values (NaNs), and each contains the data that the other is missing. I would like to combine them so that the missing values in one are filled in from the other. Here's an example:

import pandas as pd

df1 = pd.DataFrame({'color': {1: 'b'}}).T
df2 = pd.DataFrame({'height': {0: 2}}).T
df12 = pd.concat([df1, df2])

df3 = pd.DataFrame({'color': {0: 'w'}}).T
df4 = pd.DataFrame({'height': {1: 4}}).T
df34 = pd.concat([df3, df4])

Now I would like to combine df12 with df34 so that no values are missing. But if I do pd.concat([df12, df34]), I get a DataFrame in which each row label appears twice, and each copy holds one value and one NaN. I would like a DataFrame in which each row appears only once, with the values filled in. How can I do that?
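
To make the problem concrete, here is a minimal reproduction of what pd.concat does with these two frames. The names `stacked`, `row_labels`, and `nan_count` are only illustrative and are not part of the original example.

```python
import pandas as pd

# Rebuild the two partially filled frames from the example above.
df12 = pd.concat([pd.DataFrame({'color': {1: 'b'}}).T,
                  pd.DataFrame({'height': {0: 2}}).T])
df34 = pd.concat([pd.DataFrame({'color': {0: 'w'}}).T,
                  pd.DataFrame({'height': {1: 4}}).T])

# concat stacks the rows of both frames; it does not merge rows
# that share a label, so every label shows up twice.
stacked = pd.concat([df12, df34])

row_labels = stacked.index.tolist()          # labels repeat
nan_count = int(stacked.isna().sum().sum())  # the gaps are still there
print(stacked)
```

Each frame contributes its own two rows, so the NaNs from both frames are carried over unchanged instead of being filled.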

jss367

1 Answer

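Use DataFrame.combine_first. It keeps the non-null values of df12 and fills its NaNs with the values found at the same row and column labels in df34:
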
df1 = pd.DataFrame({'color': {1: 'b'}}).T
df2 = pd.DataFrame({'height': {0: 2}}).T
df12 = pd.concat([df1, df2])

df3 = pd.DataFrame({'color': {0: 'w'}}).T
df4 = pd.DataFrame({'height': {1: 4}}).T
df34 = pd.concat([df3, df4])

df12.combine_first(df34)

Output:

          0    1
color     w    b
height  2.0  4.0
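
As a rough, self-contained check of the above, the sketch below rebuilds the frames, confirms that nothing is left missing, and shows which side wins when both frames hold a value. The `left` and `right` frames are a hypothetical overlap case that does not appear in the question; they are only there to illustrate precedence.

```python
import pandas as pd

# Rebuild the two partially filled frames from the question.
df12 = pd.concat([pd.DataFrame({'color': {1: 'b'}}).T,
                  pd.DataFrame({'height': {0: 2}}).T])
df34 = pd.concat([pd.DataFrame({'color': {0: 'w'}}).T,
                  pd.DataFrame({'height': {1: 4}}).T])

# combine_first aligns on row and column labels, keeps the caller's
# non-null values, and fills the caller's NaNs from the argument.
combined = df12.combine_first(df34)

# Sanity check: no cell should be missing after the combine.
has_gaps = bool(combined.isna().any().any())

# Hypothetical overlap case (not in the question): where both frames
# hold a value, the caller's value is kept.
left = pd.DataFrame({'a': [1.0, None]})
right = pd.DataFrame({'a': [10.0, 20.0]})
preferred = left.combine_first(right)

print(combined)
print(preferred)
```

If the two frames can disagree on a cell, the order matters: df12.combine_first(df34) prefers df12, while df34.combine_first(df12) prefers df34.
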
Will