R: case_when producing unexpected "NA" with dplyr mutate

Question

I have the following user-defined function:

vareas1 <- function(a, b, c) {
  case_when(a == 1 ~ "top",
            b == 1 ~ "left",
            c == 1 ~ "right",
            near(a, 1/3) && near(b, 1/3) && near(c, 1/3) ~ "centre"
  )
}

test2 <- vareas1(1/3, 1/3, 1/3)

evaluates correctly to

[1] "centre"

However, when applying it via mutate from dplyr, it sometimes produces NA. Example follows:

test1 <- data.frame("a" = c(1, 0, 0, 1/3),
                "b" = c(0, 1, 0, 1/3), 
                "c" = c(0, 0, 1, 1/3)) %>% mutate(area1 = vareas1(a, b, c))

This results in:

          a         b         c area1
1 1.0000000 0.0000000 0.0000000   top
2 0.0000000 1.0000000 0.0000000  left
3 0.0000000 0.0000000 1.0000000 right
4 0.3333333 0.3333333 0.3333333  <NA>

The NA in row 4 instead of the result "centre" was unexpected, and I do not understand where it comes from.

I thought it might be due to the class of columns a, b and c, so I adapted the function:

vareas1_int <- function(a, b, c) {
  case_when(a == as.integer(1 * 10e6) ~ "top",
            b == as.integer(1 * 10e6) ~ "left",
            c == as.integer(1 * 10e6) ~ "right",
            near(a, as.integer(1/3 * 10e+6)) &&
              near(b, as.integer(1/3 * 10e+6)) &&
              near(c, as.integer(1/3 * 10e+6)) ~ "centre"
  )
}

and converted a, b and c to matching integers:

test1 <- test1 %>%
  mutate(a_mil = as.integer(a * 10e+6),
         b_mil = as.integer(b * 10e+6),
         c_mil = as.integer(c * 10e+6))

But the outcome was the same:

          a         b         c area1    a_mil    b_mil    c_mil area_int
1 1.0000000 0.0000000 0.0000000   top 10000000        0        0      top
2 0.0000000 1.0000000 0.0000000  left        0 10000000        0     left
3 0.0000000 0.0000000 1.0000000 right        0        0 10000000    right
4 0.3333333 0.3333333 0.3333333  <NA>  3333333  3333333  3333333     <NA>

Thank you for your help!

(This similar post doesn't cover my question.)

Not sure if `&&` is vectorised, any reason not to use `&`? Don't think anything is wrong with `case_when` here, it's just the matter of operators. — arg0naut91, Mar 01 '19 at 15:08
@arg0naut might be right. Some posts already deal with the difference between `&` and `&&` ([here](https://stackoverflow.com/q/51794058/5325862)'s one). — camille, Mar 01 '19 at 15:15
arg0naut is right, see `?"&"`, specifically the second sentence of the *Details* section. I'm sure there is a good dupe out there, but @camille's isn't quite it---it focuses on a strange case where one of the inputs is length 0.... too bad it's difficult to search SO for symbols. — Gregor Thomas, Mar 01 '19 at 15:18
@Gregor true, definitely not a dupe but I liked the explanation of `&&` short-circuiting. [Here](https://stackoverflow.com/q/29898873/5325862)'s a more FAQ-type of post. (Also fyi I used `"&&"` in my search term, with the quotation marks, to match the symbols, but still not the best) — camille, Mar 01 '19 at 15:23
Yeah, they're good questions, interesting and related, but as you say, not dupes. Thanks for the quotes tip, that's super handy! — Gregor Thomas, Mar 01 '19 at 15:26

score 6 · Accepted Answer · answered Mar 01 '19 at 15:18

You need & instead of && to make your function work with vectors: & compares element by element, whereas && is meant for single values and yields a single TRUE or FALSE.

library(tidyverse)

vareas1 <- function(a, b, c) {
  case_when(a == 1 ~ "top",
    b == 1 ~ "left",
    c == 1 ~ "right",
    near(a, 1/3) & near(b, 1/3) & near(c, 1/3) ~ "centre"
  )
}

data.frame("a" = c(1, 0, 0, 1/3),
  "b" = c(0, 1, 0, 1/3), 
  "c" = c(0, 0, 1, 1/3)) %>% mutate(area1 = vareas1(a, b, c))
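
To see where the NA comes from, it can help to look at what each operator returns for the example columns. The following is a minimal base R sketch, not part of the original answer. near_base() is a hypothetical stand-in for dplyr::near() so that the snippet runs without any packages. Because R 4.3.0 and later raise an error when && is given a vector longer than one (earlier versions used only the first element), the old behaviour is emulated here by indexing with [1].

```r
# Hypothetical stand-in for dplyr::near(): TRUE where x and y differ by less than tol
near_base <- function(x, y, tol = .Machine$double.eps^0.5) abs(x - y) < tol

df <- data.frame(a = c(1, 0, 0, 1/3),
                 b = c(0, 1, 0, 1/3),
                 c = c(0, 0, 1, 1/3))

# `&` works element by element: one logical value per row
elementwise <- near_base(df$a, 1/3) & near_base(df$b, 1/3) & near_base(df$c, 1/3)
print(elementwise)
# [1] FALSE FALSE FALSE  TRUE

# `&&` works on single values. Older R versions looked only at the first
# element of each side; indexing with [1] emulates that on any R version.
first_only <- near_base(df$a[1], 1/3) && near_base(df$b[1], 1/3) && near_base(df$c[1], 1/3)
print(first_only)
# [1] FALSE
```

With &&, the "centre" condition therefore collapses to a single FALSE taken from row 1. case_when recycles that single value across all rows, so row 4 matches none of the conditions and ends up as NA.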
