4

I have 2 data.frames (df1 et df2) with some empty cells (NAs).

df1<-data.frame(code=c("A","B","C","D"),
                x=c(2.3,NA,3.1,2.6),
                y=c(4.1,2,NA,8))

df2<-data.frame(code=c("A","B","C","D"),
                x=c(NA,8.1,NA,NA),
                y=c(NA,NA,0.5,NA))

I want to fill the NA cells in df1 with the corresponding value in df2.

Expected results :

  code   x   y
1    A 2.3 4.1
2    B 8.1 2.0
3    C 3.1 0.5
4    D 2.6 8.0

I managed to do it with for loops (scanning each cell).

It works but I guess there is a more efficient way of doing this... and I love to learn new tricks...

Thanks in advance

Maël
  • 45,206
  • 3
  • 29
  • 67
SylvainC
  • 57
  • 2

4 Answers4

2

A possible solution, using purrr::map2_dfc:

library(tidyverse)

map2_dfc(df1, df2, ~ if_else(is.na(.x), .y, .x))

#> # A tibble: 4 × 3
#>   code      x     y
#>   <chr> <dbl> <dbl>
#> 1 A       2.3   4.1
#> 2 B       8.1   2  
#> 3 C       3.1   0.5
#> 4 D       2.6   8
PaulS
  • 21,159
  • 2
  • 9
  • 26
1

Using coalesce:

library(dplyr)
do.call(coalesce, list(df1, df2))

  code   x   y
1    A 2.3 4.1
2    B 8.1 2.0
3    C 3.1 0.5
4    D 2.6 8.0
Maël
  • 45,206
  • 3
  • 29
  • 67
1

For speed

# set as data.table
lapply(list(df1, df2), \(i) setDT(i))

# custom efficient coalesce
coalesce2 <- function(...)
  {
  Reduce(function(x, y) {
    i <- which(is.na(x))
    x[i] <- y[i]
    x},
    list(...))
  }

# join
df3 <- df2[df1, on =.(code)]

# apply coalesce
df3[, `:=` (x = coalesce2(i.x, x)
            , y = coalesce2(i.y, y)
            )
    ][, c('i.x', 'i.y') := NULL
      ]
Sweepy Dodo
  • 1,761
  • 9
  • 15
  • custom coalesce [courtesy of](https://stackoverflow.com/questions/19253820/how-to-implement-coalesce-efficiently-in-r) – Sweepy Dodo Feb 02 '22 at 12:24
1

A base R option

df1[is.na(df1)] = as.numeric(df2[is.na(df1)])
df1
Donald Seinen
  • 4,179
  • 5
  • 15
  • 40