
I am a little confused about the switch statement in R. Simply googling the function I get an example as follows:

A common use of switch is to branch according to the character value of one of the arguments to a function.

 > centre <- function(x, type) {
 + switch(type,
 +        mean = mean(x),
 +        median = median(x),
 +        trimmed = mean(x, trim = .1))
 + }
 > x <- rcauchy(10)
 > centre(x, "mean")
 [1] 0.8760325
 > centre(x, "median")
 [1] 0.5360891
 > centre(x, "trimmed")
 [1] 0.6086504

However, this seems to be the same as having a bunch of if statements, one for each type.

Is that all there is to switch()? Can someone give me further examples and better applications?

John Colby
LostLin

4 Answers


Well, timing to the rescue again. It seems switch is generally faster than a chain of if statements. That, together with the fact that the code is shorter and neater with a switch statement, leans in favor of switch:

# Simplified to only measure the overhead of switch vs if

test1 <- function(type) {
 switch(type,
        mean = 1,
        median = 2,
        trimmed = 3)
}

test2 <- function(type) {
 if (type == "mean") 1
 else if (type == "median") 2
 else if (type == "trimmed") 3
}

system.time( for(i in 1:1e6) test1('mean') ) # 0.89 secs
system.time( for(i in 1:1e6) test2('mean') ) # 1.13 secs
system.time( for(i in 1:1e6) test1('trimmed') ) # 0.89 secs
system.time( for(i in 1:1e6) test2('trimmed') ) # 2.28 secs

Update: With Joshua's comment in mind, I tried other ways to benchmark. The microbenchmark package seems to be the best option, and it shows similar timings:

> library(microbenchmark)
> microbenchmark(test1('mean'), test2('mean'), times=1e6)
Unit: nanoseconds
           expr  min   lq median   uq      max
1 test1("mean")  709  771    864  951 16122411
2 test2("mean") 1007 1073   1147 1223  8012202

> microbenchmark(test1('trimmed'), test2('trimmed'), times=1e6)
Unit: nanoseconds
              expr  min   lq median   uq      max
1 test1("trimmed")  733  792    843  944 60440833
2 test2("trimmed") 2022 2133   2203 2309 60814430

Final update: here is an example showing how versatile switch is:

switch(type, case1=1, case2=, case3=2.5, 99)

This maps case1 to 1, maps both case2 and case3 to 2.5 (the empty case2 falls through to case3), and maps anything else to the unnamed default, 99. For more information, try ?switch.
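To make the fall-through and default behaviour concrete, here is a minimal sketch that wraps the call above in a function. The name `pick` is hypothetical and only there for illustration.

```r
# Hypothetical wrapper around the switch() call shown above.
pick <- function(type) {
  switch(type,
         case1 = 1,
         case2 = ,      # empty: falls through to the next non-empty alternative
         case3 = 2.5,
         99)            # unnamed alternative acts as the default
}

pick("case1")   # 1
pick("case2")   # 2.5
pick("case3")   # 2.5
pick("other")   # 99
```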

Tommy
  • Using a for loop like that can cause issues with garbage collection. The difference is much smaller with a better benchmarking function: `benchmark(test1('trimmed'), test2('trimmed'), replications=1e6)`. – Joshua Ulrich Oct 19 '11 at 18:47
  • @JoshuaUlrich ...which `benchmark` function are you using? Not the obvious one from the "benchmark" package it seems? – Tommy Oct 19 '11 at 18:52
  • According to http://stackoverflow.com/questions/6262203/function-in-r-to-measure-function-execution-time/6262540#6262540 "microbenchmark" is an even better one. – Tommy Oct 19 '11 at 19:26
  • @JoshuaUlrich - I updated the answer with results from `microbenchmark`, but they are very similar to my original ones. I don't really see how rbenchmark would get around the GC issue, but it seems to have more overhead by calling `eval` and `replicate`. – Tommy Oct 19 '11 at 19:34
  • Just as an aside, can I have multiple cases with the same output? i.e. `switch(type, c(this,that)=do something)` – LostLin Oct 20 '11 at 14:04
  • @Ellipsis... Yes, `switch(type, case1=1, case2=, case3=2.5, 99)` maps `case2` and `case3` to 2.5 and the (unnamed) default to `99`. – Tommy Oct 20 '11 at 15:50

In short, yes. But there are times when you might favor one vs. the other. Google "case switch vs. if else". There are some discussions already on SO too. Also, here is a good video that talks about it in the context of MATLAB:

http://blogs.mathworks.com/pick/2008/01/02/matlab-basics-switch-case-vs-if-elseif/

Personally, when I have 3 or more cases, I usually just go with case/switch.

John Colby

Switch can also be much easier to read than a series of if() statements. How about:

switch(id,
   "edit" = {
      CODEBLOCK
   },
   "delete" = {
      CODEBLOCK
   },
   stop(paste0("No handler for ", id))
 )
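As a runnable variant of the pattern above, the sketch below fills the CODEBLOCK placeholders with stand-in bodies. The function name `handle` and the returned strings are hypothetical; only the `switch` structure and the `stop()` default come from the answer.

```r
# Hypothetical handlers standing in for the CODEBLOCK placeholders.
handle <- function(id) {
  switch(id,
    "edit" = {
      message("editing")
      "edited"            # value of the block is its last expression
    },
    "delete" = {
      message("deleting")
      "deleted"
    },
    stop(paste0("No handler for ", id))   # default: fail loudly on unknown ids
  )
}

handle("edit")

# An unknown id reaches the default and raises an error; it is caught here
# only so that the example runs to completion.
msg <- tryCatch(handle("rename"), error = function(e) conditionMessage(e))
```

Putting `stop()` in the default position means an unexpected `id` fails immediately instead of silently returning NULL.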
3D0G

The difference between switch and if-else is mainly a stylistic choice. In general, the language-independent (C++, C#) rule of thumb is to use switch when there are roughly 3-5 or more conditions, to improve code readability. That said, there are some differences between switch and if in R that make switch more than a simple branching construct.


Character input
> Most often used
> Example case: small dictionary.
f <- function(x) switch(x, a = 100, b = 200, c = 300, 0)
g <- function(x) if(x == "a") 100 else if(x == "b") 200 else if(x == "c") 300 else 0
You can supply the names of the alternatives as regular strings, in backticks, or as bare names, and an alternative can be left empty. An empty alternative falls through, meaning that the next non-empty alternative is evaluated. If no match is found, the unnamed default is returned.

chr_switch <- function(expr) {
    switch(
        expr,
        "a" = 1,
        b =,
        `c` = 3,
        4
    )
}

sapply(c("a", "b", "c", "d"), chr_switch)
#> a b c d 
#> 1 3 3 4 
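One case the example above does not show is a character switch without a default. In that situation an unmatched input returns NULL invisibly. A minimal sketch, with the hypothetical helper name `no_default`:

```r
# Same small dictionary as before, but without an unnamed default.
no_default <- function(x) switch(x, a = 100, b = 200)

hit  <- no_default("a")   # matched: 100
miss <- no_default("z")   # unmatched and no default: NULL (returned invisibly)

is.null(miss)             # TRUE
```

Because the NULL is invisible, nothing is printed at the console, which can hide a typo in the input; adding a default that calls `stop()` is one way to guard against that.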

Numeric input
> Not used as often
> Example case 1: if n arguments are given, do that expression
f <- function(...) switch(...length(), 100, 200, 300)
g <- function(...) {l = ...length(); if(l == 1) 100 else if(l == 2) 200 else 300}

num_switch <- function(expr) switch(expr, "one","two","three","four")
sapply(1:3, num_switch)
#> [1] "one"   "two"   "three"

num_switch_empty <- function(expr) switch(expr, "one",,"three","four")
sapply(c(1, 3), num_switch_empty) # works. Note the numeric, not an integer!
#> [1] "one"   "three"
sapply(c(1,2,3), num_switch_empty) # error because 2 is empty
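Two edge cases of numeric input can be checked directly: an index beyond the number of alternatives, and an index that lands on an empty alternative. The sketch below redefines the two helpers so that it runs on its own, and wraps the failing call in `tryCatch` purely for demonstration.

```r
num_switch <- function(expr) switch(expr, "one", "two", "three", "four")
num_switch_empty <- function(expr) switch(expr, "one", , "three", "four")

# Index past the last alternative: NULL is returned invisibly, no error.
out_of_range <- num_switch(5)
is.null(out_of_range)

# Index that hits an empty alternative: numeric switch does not fall through,
# it raises an error instead.
empty_hit <- tryCatch(num_switch_empty(2), error = function(e) "error")
empty_hit
```

This is the main asymmetry with character input, where an empty alternative falls through to the next one.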

From the example above we see that type coercion takes place (numeric to integer). This coercion is not identical to as.integer; see switch(as.raw(1), 100). if does not coerce. Consider a factor:

fct <- as.factor(c("a", "b"))
switch(fct[1], "one", "two") # one + warning
switch(fct[2], "one", "two") # two + warning
if(fct[1]) "one" else "two" # one
if(fct[2]) "one" else "two" # one
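The warning raised by `switch` on a factor suggests converting to character first. The sketch below contrasts the two behaviours, using a factor whose level order is chosen so that the integer code and the label point at different alternatives. The variable names are hypothetical.

```r
# "z" is the first level, so its underlying integer code is 1.
f <- factor("z", levels = c("z", "a"))

# Factor passed directly: treated as integer code 1, so the FIRST alternative
# is picked (a warning is issued; suppressed here to keep the output clean).
by_code <- suppressWarnings(switch(f, "first", "second"))

# Converted to character: matched against the alternative names instead.
by_label <- switch(as.character(f), a = "first", z = "second")

by_code    # "first"
by_label   # "second"
```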

Performance

A benchmark (see note 1) for an increasing number n of nested expressions. switch outperforms nested if quite early on, but overall the performance difference is negligible, even if you somehow write 150 (!) nested if calls. Note that these timings are for single calls, not for vectors of inputs. The vectorized version of if is ifelse; a vectorized switch is implemented in the kit package.
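As a brief aside on the point about vectors of inputs: without extra packages, a `switch` over a vector can be approximated in base R either by applying `switch` element by element or by using a named lookup vector. This is only a sketch of those two base R options, not the kit package's implementation.

```r
x <- c("a", "c", "zzz")

# Option 1: one switch() call per element, with 0 as the default.
via_switch <- vapply(
  x,
  function(v) switch(v, a = 100, b = 200, c = 300, 0),
  numeric(1),
  USE.NAMES = FALSE
)

# Option 2: named lookup vector; unmatched names give NA, replaced by the default.
lookup <- c(a = 100, b = 200, c = 300)
via_lookup <- unname(lookup[x])
via_lookup[is.na(via_lookup)] <- 0

via_switch   # 100 300 0
via_lookup   # 100 300 0
```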

make_if <- Vectorize(function(x) {
    paste(paste0(rep("if(TRUE)", x), collapse = " "), "1L")
})
make_if(10)
#> "if(TRUE) if(TRUE) if(TRUE) if(TRUE) if(TRUE) if(TRUE) if(TRUE) if(TRUE) if(TRUE) if(TRUE) 1L"
make_switch <- Vectorize(function(x) {
    paste0("switch(", x, paste0(replicate(x, ","), collapse = ""), 1L, ")")
})
make_switch(10) 
#> [1] "switch(10,,,,,,,,,,1)"

Transforming these strings into expressions and calling bench::mark(iterations = 1e4) gives the result shown in a plot in the original answer (image not reproduced here).
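The transformation step is not shown in the answer. A minimal sketch of one way to do it in base R follows, using `parse()` and `eval()`, with `system.time()` as a rough stand-in for `bench::mark()`. The variable names are hypothetical, and the code strings are rebuilt inline so the block runs on its own.

```r
n <- 10

# Rebuild the generated code strings (same idea as make_if / make_switch).
if_code     <- paste(paste(rep("if(TRUE)", n), collapse = " "), "1L")
switch_code <- paste0("switch(", n, strrep(",", n), "1L)")

# parse() turns a string into an expression vector; [[1]] extracts the call.
if_expr     <- parse(text = if_code)[[1]]
switch_expr <- parse(text = switch_code)[[1]]

# Both expressions evaluate to the same value.
if_result     <- eval(if_expr)
switch_result <- eval(switch_expr)

# Rough timing only; a dedicated benchmarking package reports finer detail.
if_time     <- system.time(for (i in 1:1e4) eval(if_expr))
switch_time <- system.time(for (i in 1:1e4) eval(switch_expr))
```

Timings from a plain loop like this are noisy, so they are best read as an order-of-magnitude check.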

Note 1: Don't bet your life on sub-second benchmarks.

Donald Seinen