51
x <- seq(0.1,10,0.1)
y <- if (x < 5) 1 else 2

This gives a warning (or error since R version 4.2.0) that the condition has length > 1.

I want the `if` to operate on each element individually instead of on the whole vector. What do I have to change?

Gregor Thomas
Christian
  • Is this also possible with an `if (condition) {} else {}` construction? When the yes/no arguments get more complicated, `ifelse` is sometimes hard to read. I had the same problem as Christian and used `ifelse` as suggested here, which works fine but looks ugly. So far I am using expression({yes}) as a workaround, but I still wonder whether there is a way to do it with `if` and `else`. – Matt Bannert Dec 06 '10 at 09:17
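
One possible way to address the comment above, offered as a sketch rather than a definitive answer: keep an ordinary `if () {} else {}` block inside a function that handles a single value, then apply that function to every element with `vapply`. The helper name `classify_one` is made up for illustration.

```r
x <- seq(0.1, 10, 0.1)

# Hypothetical helper: it receives exactly one value, so a plain if/else is valid
# and each branch can hold as many statements as needed.
classify_one <- function(xi) {
  if (xi < 5) {
    1
  } else {
    2
  }
}

# Apply the scalar function to each element; numeric(1) declares that every
# call must return a single numeric value.
y <- vapply(x, classify_one, numeric(1))
```

This calls the function once per element, so it will generally be slower than `ifelse` or indexed assignment on large vectors; the gain is readability when the branches are involved.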

6 Answers

68
x <- seq(0.1,10,0.1)

> x
  [1]  0.1  0.2  0.3  0.4  0.5  0.6  0.7  0.8  0.9  1.0  1.1  1.2  1.3  1.4  1.5
 [16]  1.6  1.7  1.8  1.9  2.0  2.1  2.2  2.3  2.4  2.5  2.6  2.7  2.8  2.9  3.0
 [31]  3.1  3.2  3.3  3.4  3.5  3.6  3.7  3.8  3.9  4.0  4.1  4.2  4.3  4.4  4.5
 [46]  4.6  4.7  4.8  4.9  5.0  5.1  5.2  5.3  5.4  5.5  5.6  5.7  5.8  5.9  6.0
 [61]  6.1  6.2  6.3  6.4  6.5  6.6  6.7  6.8  6.9  7.0  7.1  7.2  7.3  7.4  7.5
 [76]  7.6  7.7  7.8  7.9  8.0  8.1  8.2  8.3  8.4  8.5  8.6  8.7  8.8  8.9  9.0
 [91]  9.1  9.2  9.3  9.4  9.5  9.6  9.7  9.8  9.9 10.0

> ifelse(x < 5, 1, 2)
  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
 [38] 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
 [75] 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
Roman Luštrik
16

For completeness: with large vectors, you can use logical indexing to speed things up (we often do that in simulations, where functions typically run 1000 to 10000 times). Unless you need the speed, though, just use `ifelse`; it is much easier to read.

> set.seed(100)
> x <- runif(1000,1,10)

> system.time(replicate(10000,{
+     y <- ifelse(x < 5,1,2)
+ }))
   user  system elapsed 
   2.56    0.08    2.64 

> system.time(replicate(10000,{
+   y <- rep(2,length(x))
+   y[x < 5]<- 1
+ }))
   user  system elapsed 
   0.48    0.00    0.48 
Joris Meys
  • 3
    You can cut that time even further. My machine did the second method in 0.436 (although it was slower on the first method), but this improved it by another 200%: `system.time(replicate(10000,{ y <- (x < 5) + 2*!(x < 5) }))` gave user 0.101, system 0.021, elapsed 0.128. – IRTFM Oct 31 '10 at 04:31
  • @Dwin: Very nice solution! thx. But on my machine, it runs only marginally faster (0.47 compared to 0.48) – Joris Meys Oct 31 '10 at 19:59
  • 6
    Careful - your two examples are not equivalent if `x` contains `NA` elements (which would remain `NA` in the first but would keep the default `2` in the second, because an `NA` in a logical index selects nothing when assigning). – jbaums Jul 13 '14 at 09:31
  • 1
    @jbaums That's correct. Adding an extra line y[is.na(x)] <- NA still keeps that solution a factor 2 faster than ifelse() – Joris Meys Jul 14 '14 at 10:23
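
The `NA` point raised in these comments can be sketched as follows. The small input vector is invented for illustration; the extra `is.na` line is the one suggested in the comment above.

```r
x <- c(1, 7, NA, 3, 9)

# ifelse() propagates NA from the condition.
y_ifelse <- ifelse(x < 5, 1, 2)

# Indexed assignment: an NA in a logical index selects nothing when a single
# value is assigned, so element 3 keeps the default 2 at this point.
y_index <- rep(2, length(x))
y_index[x < 5] <- 1

# The extra line restores NA wherever x is missing.
y_index[is.na(x)] <- NA

identical(y_ifelse, y_index)
```

With the extra line in place, both approaches should agree on inputs that contain missing values.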
15

`y <- if (x < 5) 1 else 2` does not operate on the whole vector (the warning you receive tells you that only the first element of the condition will be used). You want `ifelse`:

y <- ifelse(x < 5, 1, 2)

`ifelse` operates on the whole logical vector, element by element; `if` accepts only a single logical value. See `?"if"` and `?ifelse`.
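
A minimal sketch of the distinction, using a two-element vector invented for illustration: `if` is given a single logical value (either one element, or a whole vector collapsed with `all()`), while `ifelse` returns one result per element.

```r
x <- c(2, 8)

# `if` needs a single TRUE/FALSE, so test one element explicitly.
first <- if (x[1] < 5) 1 else 2

# To make one decision about the whole vector, collapse the condition
# with all() (or any()) so it has length one.
all_small <- if (all(x < 5)) "all below 5" else "some at or above 5"

# `ifelse` evaluates the condition element by element.
each <- ifelse(x < 5, 1, 2)
```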

Joshua Ulrich
4

You could also just create a logical vector and add 1 to it:

x <- seq(0.1, 10, 0.1) # Your data set   
(x >= 5) + 1
#  [1] 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
# [92] 2 2 2 2 2 2 2 2 2

If you would like to compare performance, this turns out to be the fastest solution:

set.seed(100)
x <- runif(1e6, 1, 10)

RL <- function(x) y <- ifelse(x < 5,1,2)
JM <- function(x) {y <- rep(2, length(x)); y[x < 5] <- 1; y}  # return y, not the assigned value 1
DA <- function(x) y <- (x >= 5) + 1

library(microbenchmark)
microbenchmark(RL(x),
               JM(x),
               DA(x))

# Unit: milliseconds
#  expr       min        lq      mean    median        uq       max neval
# RL(x) 331.83448 366.52940 378.89182 374.99741 381.08659 609.21218   100
# JM(x)  38.72894  42.18745  44.36493  43.25086  44.09626  82.76168   100
# DA(x)  10.01644  11.96482  14.21593  13.17825  14.12930  53.76923   100
David Arenburg
  • For a _slightly_ more efficient approach you can modify your code to `function(x) y <- (x >= 5L) + 1L` but generally a nice answer and interesting to see how slow, in comparison, `ifelse` is. – talat Dec 30 '14 at 14:21
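
To make the suggestion in the comment above concrete, here is a small sketch comparing the result types of the two variants. The variable names are illustrative, and the input is assumed to contain no `NA` values.

```r
x <- seq(0.1, 10, 0.1)

y_dbl <- (x >= 5) + 1     # logical + double  gives a double vector
y_int <- (x >= 5L) + 1L   # logical + integer gives an integer vector

typeof(y_dbl)
typeof(y_int)

# Both variants hold the same values as the ifelse() version.
all(y_dbl == y_int)
all(y_dbl == ifelse(x < 5, 1, 2))
```

Whether the integer version is measurably faster depends on the machine and R version, so it is worth benchmarking before relying on it.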
0

Following the post above, you can also use and modify only those elements of a vector that satisfy the condition. In my opinion, if the faster approach costs no extra effort, one should always use it.

x = seq(0.1,10,0.1)
y <- rep(2,length(x))
y[x<5] <- x[x<5]*2

The code in the previous post is the best answer to the question. But if I had to use the code above, I would write:

x = seq(0.1,10,0.1)
y <- rep(2,length(x))
y[x<5] <- x[x<5]*0 +1
DJJ
0
nzMean <- function(x) { mean(x[x!=-1],na.rm=TRUE)}

nzMin <- function(x) {min(x[x!=-1],na.rm=TRUE)}

nzMax <- function(x) { max(x[x!=-1],na.rm=TRUE)}

nzRange<-function(x) {nzMax(x)-nzMin(x)}

nzSD <- function(x) { sd(x[x!=-1],na.rm=TRUE)}

#following function works
nzN1<- function(x) {ifelse(x!=-1,(x-nzMin(x))/nzRange(x) ,x) }

#following is bad: the yes vector has only 4 elements, so ifelse recycles it to 5 and the values no longer line up with x
nzN2<- function(x) {ifelse(x!=-1,(x[x!=-1]-nzMin(x))/nzRange(x) ,x) }

#following is bad as it returns 5 elements of vector but not correct answer
nzN3<- function(x) {ifelse(x!=-1,(x[x!=-1]-nzMin(x))/nzRange(x) ,-1) }

y<-c(1,-1,-20,2,4)
a<-nzMean(y)
b<-nzMin(y)
c<-nzMax(y)
d<-nzRange(y)
# test the working function
z<-nzN1(y)

print(z)
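
A small sketch of what goes wrong when the `yes` argument of `ifelse` is shorter than the condition, as in `nzN2` and `nzN3` above. The names `keep`, `misaligned` and `aligned` are illustrative, and the multiplication by 10 is an arbitrary stand-in for the normalisation.

```r
x <- c(1, -1, -20, 2, 4)
keep <- x != -1

# `yes` has only 4 elements here, so ifelse() recycles it to length 5
# and picks values by position in the recycled vector, not by position in x.
misaligned <- ifelse(keep, x[keep] * 10, x)

# A full-length `yes` keeps every value at its own position.
aligned <- ifelse(keep, x * 10, x)

misaligned
aligned
```

The result always has the length of the condition, so the mistake shows up as shifted values rather than as a shorter vector.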
Arun