
Yes I know, there have been a number of questions (see this one, for example) regarding the usage of & vs. && in R, but I have not found one that specifically answers my question.

As I understand the differences,

  • & does element-wise, vectorised comparison, much like the arithmetic operators. It hence returns a logical vector that has length > 1 if both arguments have length > 1.
  • && compares the first elements of both vectors and always returns a result of length 1. Moreover, it does short-circuiting: cond1 && cond2 && cond3 && ... only evaluates cond2 if cond1 is TRUE, and so forth. This allows for things like if(exists("is.R") && is.function(is.R) && is.R()) and particularly means that using && is strictly necessary in some cases.
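The two behaviours listed above can be sketched as follows. This is a minimal illustration, not code from the question: the vectors `x` and `y` and the `evaluated` flags are made-up names, and the inputs to && are kept at length one so the result should not depend on the R version.

```r
# Illustrative inputs; these names are not from the original question.
x <- c(TRUE, FALSE, TRUE)
y <- c(TRUE, TRUE, FALSE)

# `&` works element by element and returns a vector of the same length.
elementwise <- x & y
print(elementwise)

# `&&` short-circuits: the right-hand side is skipped when the left is FALSE.
evaluated <- FALSE
short_circuit <- FALSE && { evaluated <- TRUE; TRUE }

# `&` always evaluates both sides, even when the left is FALSE.
evaluated_by_single <- FALSE
no_short_circuit <- FALSE & { evaluated_by_single <- TRUE; TRUE }

print(c(evaluated = evaluated, evaluated_by_single = evaluated_by_single))
```

Both expressions yield FALSE; the difference is only whether the braces on the right were ever run.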

Moreover, if issues the warning

the condition has length > 1 and only the first element will be used

if its condition has more than one element.
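A small sketch for checking how if reacts to a condition of length two on a given installation. The warning quoted above is what R printed when this question was asked; as far as I know, newer releases (R 4.2.0 and later) signal an error instead, so the snippet catches both rather than assuming one.

```r
# A condition of length two; what happens depends on the R version.
outcome <- tryCatch(
  {
    if (c(TRUE, FALSE)) "ran"
    "no condition signalled"
  },
  warning = function(w) paste("warning:", conditionMessage(w)),
  error   = function(e) paste("error:", conditionMessage(e))
)
print(outcome)
```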

Judging from these preliminaries, I'd consider it safer to prefer & to && in all if statements where short-circuiting isn't required.

If something went wrong during calculations and I accidentally have a vector in one of &'s arguments, I get a warning, which is good. If not, everything is fine as well.

If, on the other hand, I used &&, and something went wrong in my calculations and one of &&'s arguments is a vector, I don't get a warning. This is bad. If, for some reason, I really want to compare the first elements of two vectors, I'd argue that it's much cleaner to do so explicitly rather than implicitly.
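To make the contrast concrete, here is a hedged sketch with a made-up vector `flags` standing in for a calculation that unexpectedly returned more than one element. How && treats such an input has changed between R releases (silently using the first element at the time of this question, a warning and then an error in later versions, as far as I am aware), so the implicit form is wrapped in tryCatch. The explicit form behaves the same everywhere.

```r
flags <- c(TRUE, FALSE)   # stands in for a result that is unexpectedly a vector

# What `&&` does with a vector depends on the R version, so record the outcome.
implicit <- tryCatch(
  as.character(flags && TRUE),
  warning = function(w) "warning",
  error   = function(e) "error"
)
print(implicit)

# Picking the first element explicitly states the intent and works everywhere.
explicit <- flags[1] && TRUE
print(explicit)
```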

Note that this is contrary to what seems to be common agreement between R programmers and contrary to what the R docs recommend. (1)

Hence my question: Are there any reasons except short-circuiting that make && preferable to & in if statements?


(1) Citing help("&&"):

'&' and '&&' indicate logical AND and '|' and '||' indicate logical OR. The shorter form performs elementwise comparisons in much the same way as arithmetic operators. The longer form evaluates left to right examining only the first element of each vector. Evaluation proceeds only until the result is determined. The longer form is appropriate for programming control-flow and typically preferred in 'if' clauses.

Eike P.
  • I generally agree with how you've laid it out. One note: `&&` is slightly faster (though likely not to matter). I mostly use `&&` due to the short circuiting it allows. I also wish R had scalar data types. Not sure what you mean by "bitwise" comparison; I don't think that applies for these operators. – BrodieG Apr 27 '15 at 14:56
  • @BrodieG regarding "bitwise": Truth be told I took that expression from [this answer](http://stackoverflow.com/a/6933612/2207840). I guess you're right and it doesn't apply here; I'll remove it in a second to avoid confusion. – Eike P. Apr 27 '15 at 15:09

2 Answers


No, using && does not offer any advantages other than short-circuiting.

However, short-circuiting is very much preferable for control flow, so much so that it should be the default. if statements should not take vectorised arguments; that's what ifelse is for. If you are passing a logical vector into if, you would typically reduce it to a single logical value using any or all before the evaluation.
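As a rough illustration of that split (the data in `x` and the variable names are invented for the example): ifelse makes one choice per element, while if gets a single value produced by all or any.

```r
x <- c(3, -1, 7)   # illustrative data

# Element-wise choice: one result per element.
labels <- ifelse(x > 0, "positive", "non-positive")
print(labels)

# Control flow: reduce the vector to a single TRUE/FALSE first.
if (all(x > 0)) {
  verdict <- "all positive"
} else if (any(x > 0)) {
  verdict <- "some positive"
} else {
  verdict <- "none positive"
}
print(verdict)
```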

The major advantages of short-circuiting are in avoiding lengthy or failure-prone steps (e.g. internet connections, though these should be dealt with through try):

# Avoiding lengthy calculations: & runs the 2-second sleep, && skips it
system.time(if (FALSE & {Sys.sleep(2); TRUE}) print("Hello"))
   user  system elapsed 
   0.00    0.00    1.99 
system.time(if (FALSE && {Sys.sleep(2); TRUE}) print("Hello"))
   user  system elapsed 
      0       0       0 

# Avoiding errors: & triggers stop(), && never reaches it
if (FALSE & {stop("Connection Failed"); TRUE}) print("Success") else print("Condition not met")
Error: Connection Failed
if (FALSE && {stop("Connection Failed"); TRUE}) print("Success") else print("Condition not met")
[1] "Condition not met"

It is clear that in order to take advantage of these features, you would have to know in advance which steps take the longest or are prone to errors and construct the logical statement appropriately.
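A common way to use this ordering is to put a cheap guard first. The sketch below assumes a hypothetical optional setting `timeout` that was left as NULL; it is only meant to show how the order of the operands matters, not to prescribe a pattern.

```r
timeout <- NULL   # hypothetical optional setting that was never filled in

# `&&`: the cheap NULL check runs first, so `timeout > 0` is never evaluated.
guarded <- !is.null(timeout) && timeout > 0
print(guarded)

# `&`: both sides are evaluated; `NULL > 0` is logical(0), and so is the result.
unguarded <- !is.null(timeout) & timeout > 0
print(length(unguarded))

# A zero-length condition makes `if` fail, so catch the error for display.
if_result <- tryCatch(
  if (unguarded) "positive timeout" else "no timeout",
  error = function(e) "error"
)
print(if_result)
```

With & the guard does not protect anything: the comparison on the right still runs, and the zero-length result is not a usable condition.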

James
  • Well, I specifically asked for *reasons except short-circuiting*... Of course, in the cases you mentioned it is clearly preferable / necessary to use `&&`, but personally, I have never come across such a case in my code. That's why I asked for other reasons. – Eike P. Apr 27 '15 at 15:33
  • Moreover, I never argued for using `if` with vectorised arguments; on the contrary: "I'd argue that it's much cleaner to do so explicitly rather than implicitly." – Eike P. Apr 27 '15 at 15:35
  • @jhin Yes, I know you asked except for short-circuiting, but the question didn't seem to appreciate its value. In general you shouldn't evaluate arguments if you do not need to. The warning you get when using `&` is simply that: a warning. The rest of the program will carry on regardless. Depending on your settings and the context of the execution you may not even see it. `&&` will only evaluate the first element of the vector anyway. – James Apr 27 '15 at 15:48
  • "In general you shouldn't evaluate arguments if you do not need to." - I certainly agree with that! And I'd much prefer such a 'lazy' (is that the correct term?) operator *that also issues a warning when applied to vectors of length > 1*. – Eike P. Apr 27 '15 at 15:54
  • "`&&` will only evaluate the first element of the vector anyway." - Yep, and that's exactly what I dislike about it. It doesn't even issue a warning but carries on silently, although in many cases either a scalar or a vector will indicate an error in a previous calculation. – Eike P. Apr 27 '15 at 16:03
  • We certainly agree that using `any` and `all` everywhere possible is preferable, by the way. ;-) – Eike P. Apr 27 '15 at 16:05
  • @jhin Well, I'm glad we agree on something ;) I think you are overstating the value of getting a warning though. Would it be preferable to preface the answer with the rider that there are no other reasons than short-circuiting? – James Apr 27 '15 at 16:55
  • @James If you think that that's the case, then yes, I'd prefer that! :-) sorry if I overreacted here; I'll undo the downvote asap. I'm currently on a mission to make my R programming style as safe as possible, also see my other questions. That's why I may be asking questions that seem a bit weird. ;-) – Eike P. Apr 27 '15 at 17:09

Short answer: Yes, the different symbol makes the meaning more clear to the reader.

Thanks for this interesting question! If I can summarize, it seems to be a follow-up specifically about this section of my answer to the question you linked,

... you want to use the long forms only when you are certain the vectors are length one. You should be absolutely certain your vectors are only length 1, such as in cases where they are functions that return only length 1 booleans. You want to use the short forms if the vectors are length possibly >1. So if you're not absolutely sure, you should either check first, or use the short form and then use all and any to reduce it to length one for use in control flow statements, like if.

I hear your question (given comments) this way: But & and && will do the same thing if the inputs are length one, so other than short-circuiting, why prefer &&? Perhaps & should be preferred because if they're not length one, if will give me a warning, helping me be even more certain that the inputs are length one.

First, I agree with the comment by @James that you may be "overstating the value of getting a warning"; if it's not length one, the safer thing will be to handle this appropriately, not to just plow ahead. You could make a case that && should throw an error if they're not length one, and perhaps a good case; I don't know the reason why it does what it does. But without going back in time, the best we can do now is to check that the inputs are indeed appropriate for your use.
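One way such a check could look is sketched below. The helper `assert_flag` and the variables `ready` and `valid` are hypothetical names introduced for illustration; base R's stopifnot could be used for the same purpose.

```r
# Hypothetical helper: stop early unless `x` is a single non-missing logical.
assert_flag <- function(x, name = deparse(substitute(x))) {
  if (!is.logical(x) || length(x) != 1L || is.na(x)) {
    stop(name, " must be a single TRUE or FALSE", call. = FALSE)
  }
  invisible(x)
}

ready <- TRUE
valid <- c(TRUE, FALSE)   # unexpectedly a vector

# A proper scalar passes the check and can be combined with `&&`.
ok <- assert_flag(ready) && TRUE
print(ok)

# A vector is rejected with an explicit error instead of being truncated.
checked <- tryCatch(
  assert_flag(valid) && ready,
  error = function(e) conditionMessage(e)
)
print(checked)
```

The point of the sketch is that the length problem is handled where it occurs, rather than being left to whatever & or && happen to do with a vector.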

Given, then, that you have checked to make sure your inputs are appropriate, I would still recommend && because it semantically reminds me as the reader that I should be making sure the inputs are scalars (length one). I'm so used to thinking in vectors that this reminder is helpful to me. It follows the principle that different operations should have different symbols, and for me, an operation that is meant for use on scalars is different enough from a vectorized operation that it warrants a different symbol.

(Not to start a flame war (I hope), but this is also why I prefer <- to =; one for assigning variables, one for setting parameters to functions. Although deep down this is the same thing, it's different enough in practice to make the different symbols helpful to me as a reader.)

Aaron left Stack Overflow
  • That's an interesting point that you make here! I see it a bit ambivalently. While I do agree that one should have (also visually) different operators for scalars and vectors, what hurts me is the fact that `&` and `&&` are not *different enough* in that they both happily accept vectors. Obviously, this is a problem with R, not with your argument. – Eike P. Apr 28 '15 at 09:42
  • Ah, I think I can explain even better in response to your comment. The different symbol tells me that I *should be* dealing with scalars, so whenever I see this I'm reminded to make sure I've got the proper checks in place. Answer edited slightly to reflect. – Aaron left Stack Overflow Apr 28 '15 at 14:58