
Suppose I have a data frame with a bunch of columns where I want to do the same NA replacement:

dd <- data.frame(x = c(NA, LETTERS[1:4]), a = rep(NA_real_, 5), b = c(1:4, NA))

For example, in the data frame above I'd like to do something like replace_na(dd, where(is.numeric), 0) to replace the NA values in columns a and b.

I could do

num_cols <- purrr::map_lgl(dd, is.numeric)
r <- as.list(setNames(rep(0, sum(num_cols)), names(dd)[num_cols]))
replace_na(dd, r)

but I'm looking for something tidier/more idiomatic/nicer ...

Ben Bolker

1 Answer


If the replacement needs to be selected dynamically with where(is.numeric), wrap replace_na in across, which applies it to each selected column as a vector:

library(dplyr)
library(tidyr)
dd %>%
   # lambda form: passing extra arguments through across() is deprecated in dplyr >= 1.1.0
   mutate(across(where(is.numeric), ~ replace_na(.x, 0)))

Alternatively, pass the replacements to replace_na as a named list of column/value pairs:

replace_na(dd, list(a = 0, b = 0))

That list can be built programmatically: select the numeric columns, take their names, and convert them to key/value pairs with deframe (or, as below, use summarise to set every selected column to 0), then pass the result to replace_na:

library(tibble)
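dd %>% 
  select(where(is.numeric)) %>%
  # one-row data frame with every numeric column set to 0
  summarise(across(everything(), ~ 0)) %>%
  # convert it to the named list that replace_na() expects
  as.list() %>%
  replace_na(dd, .)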
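
The deframe route mentioned above is not spelled out, so here is a minimal sketch of how it might look. The names num_names, repl, and result are illustrative only, and the data frame dd is repeated from the question so the snippet runs on its own.

```r
library(dplyr)
library(tidyr)
library(tibble)

dd <- data.frame(x = c(NA, LETTERS[1:4]), a = rep(NA_real_, 5), b = c(1:4, NA))

# Names of the numeric columns (hypothetical helper variable)
num_names <- names(select(dd, where(is.numeric)))

# Two-column tibble (name, value) -> named vector via deframe() -> named list
repl <- as.list(deframe(tibble(name = num_names, value = 0)))

# Replace NA only in the columns named in repl; other columns are left alone
result <- replace_na(dd, repl)
result
```

This is still the key/value approach, so it shares its drawback: the list has to be assembled before the call, whereas the across version applies the replacement column by column in one step.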
akrun
    The second is the one I wanted to avoid (since it's hard to do with `tidyselect`/programmatically); the first is the solution I was looking for (vector-wise application). Can't accept for a little while longer. – Ben Bolker Dec 13 '21 at 21:09