1

As a follow-up to my previous question on the usage of && and || in if statements, I am wondering whether there is any drawback to replacing && and || by purely scalar operators in my own package. These should produce exactly the same results as their base counterparts, except that they throw error messages when applied to arguments of length > 1. In particular, they do short-circuiting.

`&&` <- function(x, y) {
    if (!x[1]) {
        stopifnot(length(x) == 1)
        return(FALSE)
    }
    stopifnot(length(x) == 1)
    if (!y[1]) {
        stopifnot(length(y) == 1)
        return(FALSE)
    }
    stopifnot(length(y) == 1)
    TRUE
}

`||` <- function(x, y) {
    if (x[1]) {
        stopifnot(length(x) == 1)
        return(TRUE)
    }
    stopifnot(length(x) == 1)
    if (y[1]) {
        stopifnot(length(y) == 1)
        return(TRUE)
    }
    stopifnot(length(y) == 1)
    FALSE
}

Is it safe to do these replacements in a package? Or might that break something? (Except functions that actually rely on the vectorised behavior, obviously...)

Note that I do not plan to export these functions; they are only for internal use in my package.

If I understand it correctly, these replacements should then only affect my own functions. They can't affect anything outside of my package, right?

Community
  • 1
  • 1
Eike P.
  • 3,333
  • 1
  • 27
  • 38
  • 6
    I personally wouldn't do it. You can't just use different function names? – Rich Scriven Apr 28 '15 at 23:40
  • At first I wanted to, but ran into problems due to operator precedence. Custom binary operators seem to have lower precedence than, e.g., `==`, which can lead to problems in typical `if` conditions. This can of course be solved by adding more parentheses, but I did not want to change the syntax. That said, if this really doesn't break anything, I actually do prefer overloading the original operators in order to make sure I really use the new ones everywhere! (Since I consider them more safe; also see the referenced question.) – Eike P. Apr 28 '15 at 23:47
  • 2
    They should affect any subsequent expressions in the `globalenv()` or calls that will eventually reach there along the current search path. Functions defined this way are said to be "mask"-ing the base operators and you should see a warning at the time your package is loaded. There's nothing in your code that would restrict their effect to only your package functions. If you want them to not be exported, then you should study how R NAMESPACEs function. `&` and `&&` have the same precedence. Read `?Syntax`. If you had defined `%&&%` it would have had higher precedence. – IRTFM Apr 29 '15 at 01:18
  • 2
    And many people have found this useful: http://blog.obeautifulcode.com/R/How-R-Searches-And-Finds-Stuff/ – IRTFM Apr 29 '15 at 01:26
  • @BondedDust I did not plan to export them! I only want to use them internally. (Which means I actually don't have to change anything in the NAMESPACE file...) I'll edit the question to make this more clear. Any that link is really extraordinarily useful; I used it for compiling what I came up with. ;) – Eike P. Apr 29 '15 at 07:42
  • @BondedDust I mixed it up, higher precedence is of course correct. But this means I need to do `(a == 1) %&% (b == 2)` instead of `a == 1 && b == 2`. – Eike P. Apr 29 '15 at 07:44
  • 1
    I didn't think that you were proposing creating `%&%`. I was only suggesting that precedence would be a problem if you had done so, and hoping to give you an authoritative reference for R's ordering of precedence. I do not see that redefining `&&` and `||` as you propose would break any code that depended on expectations of "vectorization". – IRTFM Apr 29 '15 at 14:18
  • @BondedDust thanks, that's helpful! – Eike P. Apr 29 '15 at 14:33
  • Another question about dealing with NAs: http://stackoverflow.com/q/16822426/210673 – Aaron left Stack Overflow May 01 '15 at 01:15

1 Answers1

1

No. You might be able to make it safe for the computer, but code needs to be safe for the reader too. Someday you'll want to share this code with others, or you'll come back to it yourself after a time away, and this kind of overloading will be confusing. Better by far to use a new function name.

Also, you'll need to be very careful about how your functions handle NA and NULL values. Your current version has different behaviors than the usual ones for NA, and possibly for NULL; I didn't do a full check. I was surprised to notice, though, that when one of the inputs is NULL, order matters for the standard R versions.

EDIT: This issue with NA values is something to think hard about, and possibly another reason to not just write a single function intended to replace && and ||. For flow control, you need either TRUE or FALSE, NA will throw an error. So should TRUE && NA throw an error? Or be FALSE? Or even TRUE? It may depend, and doing the usual thing of if(!is.na(x) && x) or if(any(is.na(x))) may offer more flexibility and clarity to the reader.

If I were to proceed, my preference would be to think of your new versions as scalar versions of all and any and name them accordingly, to be used like scalar.all(a, b) instead of a && b. Or perhaps if you're thinking about it from the flow control perspective, add a parameter to describe what to do with NA values, flowcontrol.all(a, b, na=c("error", "TRUE", "FALSE").

By the way, I applaud your efforts, in general. Being aware of these issues and writing code to be safe about them seems like a good idea. However, after thinking about this more and trying briefly to write my own versions, I feel that within your own package it would be preferable to do this as needed on a case by case basis. There are times when you'll know as a package writer that one or both of the inputs can only be a scalar, and then this extra code is unnecessary.

Still, here's an attempt at a scalar.all function.

scalar.all <- function(...) {
   ns <- eval(substitute(alist(...)))
   sofar <- TRUE
   for(n in ns) {
      x <- eval(n)
      if (length(x) != 1L) {
        stop(paste(list(n)), " is not a scalar with length 1")
      }
      if (is.na(x)) sofar <- NA
      else if (!x) return(FALSE)
   }
   sofar
}
Aaron left Stack Overflow
  • 36,704
  • 7
  • 77
  • 142
  • 1
    I like how this answer is exactly in the same spirit as your previous one. Follow-up answer for follow-up question. ;-) – Eike P. Apr 29 '15 at 07:50
  • Do you have any idea on how to deal with that operator precedence problem I meantioned in the comments on the question? I guess I could live with the additional parentheses if really necessary, but it kind of bugs me. – Eike P. Apr 29 '15 at 07:53
  • 1
    @jihn: Precedence doesn't appear to be a problem if you redefine `&&` and `||`. And if you don't do it at the command line or in .Rprofile and don't export them, then you will restrict the impact in a sensible manner. – IRTFM Apr 29 '15 at 14:24
  • Edited answer to reflect concern about `NA` and `NULL` values, and also comment about the benefit of this within a package. – Aaron left Stack Overflow Apr 29 '15 at 14:56