0

I would like to replace NAs of df1 by df2(both of two data frames are large with about 155 columns x 3966 rows) Both of two data frames are with the first column 'ID' and other column names are overlapped. Here is a part of example:

> df1
       ID col1 col2  ...... col154
1  AMM115    C    A  ......     A+
2  ADM107   NA   NA  ......      B
3  AGM041    B    C  ......     C+
4  AGM132    A   NA  ......     A+
5  AQM007   NA    A  ......     B+
6  ARM028   NA   B+  ......     A-
7  ASM019    A   A+  ......     NA

> df2
       ID col1 col2  ...... col154
1  ADM107   A+    B  ......      B
2  AGM041    C    A  ......     B+
3  ARM028   A+   B+  ......     NA
4  AQM007   B+    A  ......     B+

I try this answers here : Can I replace NAs when joining two data frames with dplyr? I know coalesce might be I needed. It can fills the NA from the first vector but how could I across all columns. but I can't figure out how to mutate and coalesce multiple columns like my real data expect first column. Also select suffix .x or .y at this code.

library(dplyr)
df1 %>% 
        left_join(df2, by = "fruit") %>% 
        mutate(var2 = coalesce(var2.x, var2.y)) %>% 
        select(-var2.x, -var2.y)

I want get like df3, if NAs of df1 and df2 have value , using df2 to replace NAs of df1,if not, keep df1 original values.

> df3
       ID col1 col2  ...... col154
1  AMM115    C    A  ......     A+
2  ADM107   A+    B  ......      B
3  AGM041    B    C  ......     C+
4  AGM132    A   NA  ......     A+
5  AQM007   B+    A  ......     B+
6  ARM028   A+   B+  ......     A-
7  ASM019    A   A+  ......     NA
Roq
  • 182
  • 7

1 Answers1

1

You can do:

library(dplyr)
df3 <- df1 %>%
  rows_patch(df2, by = 'ID')
deschen
  • 10,012
  • 3
  • 27
  • 50