Replace column condition dplyr

Question

ID <- c("A","B","C","D","E")
AT <- c(0.1,0.2,0.3,0.4,0.5)
US <- c(NA,NA,0.6,NA,NA)
FIGX <- c(1,NA,NA,2,3)
W1 <- c(NA,10,20,30,40)
test.Data <- data.frame(ID,AT,US,FIGX,W1) %>% as.data.table()

I have this kind of table. I would like to replace the values of column US by values of FIGX if NA, and if FIGX is NA, then to replace by the W1 column values.

I have tried this test.Data %>% mutate_if(is.na(US),mutate_if(is.na(FIGX),W1)) without success. How should I do?

Does this answer your question? [How to implement coalesce efficiently in R](https://stackoverflow.com/questions/19253820/how-to-implement-coalesce-efficiently-in-r) — camille, Dec 10 '19 at 18:28
@camille I'd say that's not a good dupe - that's about how to write a function that does `coalesce` (and pre-dates the `dplyr` implementation by many years). This question is about using a `coalesce`-like function in dplyr, and the root of the issue seems to be a confusion between `mutate_if` and `ifelse`. — Gregor Thomas, Dec 10 '19 at 19:45
@IRT, your attempt would work if you use `ifelse` inside mutate, `test.Data %>% mutate(ifelse(is.na(US), ifelse(is.na(FIGX), W1, FIGX), US))`, but `coalesce` is made for replacing `NA`s like this and is much cleaner. Use `mutate_if` if you want to perform the exact same operation on multiple columns that meet a condition - e.g., round all numeric columns, or log all numeric columns that are strictly positive, or add a week to all date columns. — Gregor Thomas, Dec 10 '19 at 19:48
@Gregor that's fair, although the dplyr function is included in at least one answer there. I don't think the `mutate_if` issue is key though, since it's not the right use of that function anyway — camille, Dec 10 '19 at 20:18
I think OP's misunderstanding of `mutate_if` is the core of the question. OP thinks `mutate_if` works like `ifelse`. Letting them know about `coalesce` solves the specific case of replacing `NA`s, but the `mutate_if` confusion should be addressed now so they don't have a question tomorrow about why `mutate_if(x > 0, sqrt(x), -sqrt(-x))` doesn't work for a signed square root. — Gregor Thomas, Dec 10 '19 at 20:28

akrun · Answer 1 · 2019-12-10T20:34:07.790

0

Here is an option with daa.table as the input data is already a data.table

library(data.table)
library(dplyr)
test.Data[is.na(US), US := coalesce(FIGX, W1)][]
test.Data
#   ID  AT   US FIGX W1
#1:  A 0.1  1.0    1 NA
#2:  B 0.2 10.0   NA 10
#3:  C 0.3  0.6   NA 20
#4:  D 0.4  2.0    2 30
#5:  E 0.5  3.0    3 40

For multiple columns, we can use set

colsOfInterest <- c("US", "AT")
for(nm in colsOfInterest){
   i1 <- which(is.na(test.Data[[nm]]))
   set(test.Data, i = i1 ,
         j = nm, value = coalesce(test.Data[["FIGX"]][i1], test.Data[["W1"]][i1]))
  }

edited Dec 10 '19 at 20:34

answered Dec 10 '19 at 18:16

akrun

874,273
37
540
662

2

`coalesce(US, FIGX, W1)`, no `case_when` needed. – Gregor Thomas Dec 10 '19 at 18:17
How to generalize if I need to do this not only for US but for all possible columns (in fact I have like 30 columns ). I have been looking for a loop way. – IRT Dec 10 '19 at 20:28
Hi, when I execute the lines , I got an error: Error i set(tmp,i=i1,j=nm,value = coalesce(tmp[["FIGX"]][i1],tmp[["W1"]][i1]): unused argument (tmp[["W1"]][i1]). Any idea why? – IRT Dec 11 '19 at 06:53
Sorry, I can't reproduce it – akrun Dec 11 '19 at 13:12

Replace column condition dplyr

1 Answers1