R: canonical dplyr way to replace values conditionally

Question

Suppose you have:

df = data.frame(a = c(1,2,NA),b = c(NA, 1,2))
> df
   a  b
1  1 NA
2  2  1
3 NA  2

and wish to create a new column c based on a. If a is missing, then use b. This works:

df %>% mutate(c= a,
              c = replace(c, is.na(a), b[is.na(a)]))

but it looks clumsy to me (is it just me?), since I have to spell out is.na(a) twice. This is easier:

df %>%
   rowwise() %>% 
   mutate(c = a,
          c = replace(c, is.na(a), b))

but it requires the extra rowwise() command, and I could imagine situations where some of my mutate statements will not work rowwise.

Am I missing some dplyr feature that makes this (very common?) task easier?

a simple `ifelse` ? `df %>% mutate(c = ifelse(is.na(a), b, a))` — Ronak Shah, Sep 14 '19 at 14:12

tmfmnk · Accepted Answer · 2019-09-14T14:15:34.523

5

For this, you can use coalesce() from dplyr:

df %>%
 mutate(c = coalesce(a, b))

   a  b c
1  1 NA 1
2  2  1 2
3 NA  2 2

From the documentation:

Given a set of vectors, coalesce() finds the first non-missing value at each position.
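Since coalesce() works left to right across any number of vectors, a length-one value at the end can serve as a default when every column is missing. A minimal sketch, assuming a variant of df in which b is also NA in the last row (that variant and the default of 0 are illustrative, not part of the question):

```r
library(dplyr)

# Illustrative variant of df: both a and b are missing in row 3
df2 <- data.frame(a = c(1, 2, NA), b = c(NA, 1, NA))

# First non-missing value at each position, checked in the order a, b, 0;
# the trailing 0 is recycled and only used where a and b are both NA
res <- df2 %>%
  mutate(c = coalesce(a, b, 0))

res
```

All arguments should share a compatible type; mixing, say, numeric and character columns is likely to raise an error rather than coerce silently.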

Or if you want to apply it on the whole df:

df %>%
 mutate(c = coalesce(!!!.))
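To unpack the shorthand: with the magrittr pipe, `.` stands for the whole data frame, and `!!!` splices its columns into coalesce() as separate arguments, in column order. The sketch below compares that form with an explicit splice of df; the names spliced and explicit are only for illustration. Note that the `.` placeholder belongs to %>%, so this form is not expected to work with the native |> pipe.

```r
library(dplyr)

df <- data.frame(a = c(1, 2, NA), b = c(NA, 1, 2))

# `.` is the piped-in data frame; `!!!` turns its columns into
# individual arguments, so this amounts to coalesce(a, b)
spliced <- df %>%
  mutate(c = coalesce(!!!.))

# Same idea, naming the data frame explicitly instead of using `.`
explicit <- df %>%
  mutate(c = coalesce(!!!df))

identical(spliced$c, explicit$c)
```

Because every column is spliced in, this only makes sense when all columns have compatible types and the column order reflects the desired priority.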

edited Sep 14 '19 at 14:15

answered Sep 14 '19 at 14:09

tmfmnk
