
I am very new to R and am trying to get my head around the different commands. I have this simple code:

setwd("C:/Research") 

tempdata=read.csv("temperature_humidity.csv")

Thour=tempdata$t

RHhour=tempdata$RH
weather=data.frame(cbind(hour,Thour,RHhour))
head(weather)
if (Thour>25) {
y=0 else {
y=3
}

x=Thour+y*2

x

I simply want the code to read Thour (temperature) from the CSV file and, if it is higher than 25, use y=0 in the formula; if it's lower than 25, use y=3.

I tried ifelse, but it doesn't work either.

Thanks for your help.

Hank

3 Answers


I've said this too many times today already, but avoid ifelse as much as possible (it is inefficient and unnecessary in most cases). Try this instead:

c(3, 0)[(Thour >= 25) + 1]

The comparison returns a logical vector of TRUE/FALSE, which is coerced to 0/1 when 1 is added to it and so becomes 1/2. Those values are then used as indexes into c(3, 0).
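To see each step of the indexing trick, here is a minimal sketch on a small made-up vector (the values in `Thour` and the labels are illustrative, not taken from the original CSV):

```r
# Hypothetical temperatures, for illustration only
Thour <- c(18, 25, 31)

is_hot <- Thour >= 25      # FALSE  TRUE  TRUE
idx    <- is_hot + 1       # 1 2 2 (logical coerced to 0/1, then shifted by 1)
y      <- c(3, 0)[idx]     # 3 0 0

# The same lookup also works when the results are not numeric
label <- c("cool", "hot")[idx]   # "cool" "hot" "hot"
```

Because the lookup vector can hold any type, this form is not limited to mapping onto 0 and a number.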

Or even better solution (posted by @BondedDust in comments) would be:

3*(Thour <= 25)

Here the comparison again returns a logical vector of TRUE/FALSE, which is coerced to 1/0 when multiplied by 3, giving 3 where the condition holds and 0 elsewhere.
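Plugged into the formula from the question, this might look like the sketch below. The data frame is invented here to stand in for the CSV, so only the column names `t` and `RH` come from the question:

```r
# Stand-in for read.csv("temperature_humidity.csv"); the values are invented
tempdata <- data.frame(t = c(18, 25, 31), RH = c(60, 55, 40))

Thour <- tempdata$t
y <- 3 * (Thour <= 25)   # 3 where Thour is at most 25, otherwise 0
x <- Thour + y * 2       # element-wise, one result per row
x
```

No loop or `if` is needed, since both the comparison and the arithmetic work on the whole column at once.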

Benchmark comparison:

 Thour <- sample(1:100000)
 require(microbenchmark)

 microbenchmark(ifel = {ifelse(Thour < 25 , 0 , 3)}, Bool = {3*(Thour >= 25)})
Unit: microseconds
 expr       min        lq    median        uq      max neval
 ifel 38633.499 41643.768 41786.978 55153.050 59169.69   100
 Bool   901.135  1848.091  1972.434  2010.841 20754.74   100
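Before comparing timings, it is worth checking that the two benchmarked expressions really compute the same thing. A quick sanity check using base R only (the seed is arbitrary and only there for reproducibility):

```r
set.seed(1)                      # arbitrary seed, assumption for reproducibility
Thour <- sample(1:100000)

via_ifelse <- ifelse(Thour < 25, 0, 3)
via_bool   <- 3 * (Thour >= 25)

all.equal(via_ifelse, via_bool)  # TRUE if both give the same values
```

Note that both expressions map values of 25 and above to 3, which is the reverse of the mapping asked for in the question. That does not affect the timing comparison.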
IRTFM
David Arenburg
  • @Hank: The first suggestion is more general and would allow selection of either numeric or character vectors. The Boolean algebra approach is only effective when the desired mapping is to {0, n} where n is numeric. I used to think the Boolean algebra approach should be faster, but in head-to-head comparisons I haven't found `ifelse` to be particularly inefficient. – IRTFM Oct 07 '14 at 19:42
  • @BondedDust, I have found `ifelse` inefficient in every benchmark I did. Ever. – David Arenburg Oct 07 '14 at 21:03
  • I'm not sure whether my memory is failing me or `ifelse` has gotten less efficient, but the benchmark I just ran confirms the efficiency of the Boolean approach by a substantial margin. I hope you don't mind that I am editing your answer to include it. – IRTFM Oct 07 '14 at 21:17
  • @BondedDust, good edit for illustration. I did the same in my [previous answer](http://stackoverflow.com/questions/26239328/conditional-difference-calculation-in-data-table/26243030#26243030), so I got too lazy to do it again in such a short time period – David Arenburg Oct 07 '14 at 21:27
  • I would not have assumed that benchmarks done in a data.table environment would necessarily translate to this situation. – IRTFM Oct 07 '14 at 21:31
  • @BondedDust, I did many other benchmarks before too. I can look it up if you want, but I think that your illustration will suffice for now. – David Arenburg Oct 07 '14 at 21:33
  • A small voice of dissent: While @BondedDust's solution is quite clear, `c(3, 0)[(Thour >= 25) + 1]` is much more opaque, especially to a self-proclaimed new R user like OP. The `ifelse` statement is readable, and if the data is small enough that the performance difference is negligible, I would encourage new users to use the method that is clearest and most readable to them. – Gregor Thomas Oct 07 '14 at 22:05
  • @Gregor perhaps assigning to subsets is more readable? `y <- Thour; y[Thour < 25] <- 0; y[Thour >= 25] <- 3`. It's still slower than the Boolean approach, but much faster than `ifelse` – GSee Oct 08 '14 at 01:11
  • By the way, all of these are wrong. The OP has `if (Thour > 25) 0 else 3`, which is the opposite sign. – GSee Oct 08 '14 at 01:12

This should work for you. Just replace what you're naming Thour with the appropriate code.

Thour <- sample(1:100, 1)
Thour
# [1] 8
y <- ifelse(Thour >= 25, 0, 3)
y
# [1] 3

And:

Thour <- sample(1:100, 1)
Thour
# [1] 37
y <- ifelse(Thour >= 25, 0, 3)
y
# [1] 0

You may need to change the logical operator (>=) to match your exact circumstances, since it's unclear whether a temperature of exactly 25 should fall in the higher or the lower range.
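The effect of the operator choice only shows up at the cutoff itself. A small sketch with illustrative values around 25 (the variable names `strict` and `inclusive` are made up for this example):

```r
Thour <- c(24, 25, 26)                   # illustrative values around the cutoff

strict    <- ifelse(Thour >  25, 0, 3)   # 3 3 0  (25 is treated as "not higher")
inclusive <- ifelse(Thour >= 25, 0, 3)   # 3 0 0  (25 is treated as "higher")
```

The two versions differ only for the value that is exactly 25.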

n8sty

R has a very flexible syntax. So you can write this in many ways:

# ifelse() function
y <- ifelse(Thour > 25, 0, 3)

# more ifelse()
y <- 3 * ifelse(Thour > 25, 0, 1)

# The simpler way:
y <- 3 * (Thour <= 25)

By the way, use <- instead of = for assignment; it's the "preferred" style.
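For context on why the original `if` did not behave as hoped: `if` expects a single TRUE or FALSE, whereas `Thour` holds a whole column. Depending on the R version, passing a longer vector to `if` either raises an error or uses only the first element with a warning. A sketch contrasting the scalar and vectorized forms (the `Thour` values and the names `y_first`, `y_loop`, and `y_vec` are invented for illustration):

```r
Thour <- c(18, 25, 31)           # invented values

# Scalar if/else: fine for one value at a time
y_first <- if (Thour[1] > 25) 0 else 3

# Applying the scalar test to every element explicitly
y_loop <- vapply(Thour, function(t) if (t > 25) 0 else 3, numeric(1))

# Vectorized equivalent
y_vec <- ifelse(Thour > 25, 0, 3)
```

`y_loop` and `y_vec` hold the same values; the vectorized form is simply shorter and avoids the explicit per-element function call.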

Gregor Thomas
Barranka
  • The first suggestion will in general not be correct. The third one will be effective but less efficient. – IRTFM Oct 07 '14 at 19:39
  • @BondedDust Thanks for your feedback. AFAIK the last one (`3 * (Thour > 25)`) is the most efficient, do you agree? – Barranka Oct 07 '14 at 19:57
  • I have found the pure `ifelse` to be competitive with Boolean algebra. – IRTFM Oct 07 '14 at 20:21