I was trying to optimize my loop but ran into an issue, and I haven't found a direct solution here. I already checked other threads such as "Error in if/while (condition) {: missing value where TRUE/FALSE needed", but they didn't help me solve my problem; I still have the same issue.

This is my code:

output <- character (nrow(df)) # predefine the length and type of the vector
condition <- (df$price < df$high & df$price > df$low)   # condition check outside the loop

system.time({
    for (i in 1:nrow(df)) {
        if (condition[i]) {
            output[i] <- "1"
         }else if (!condition[i]){
           output[i] <- "0"
        }else  {
            output[i] <- NA
        }
    }
    df$output <- output
})


I am basically checking whether my price is within a certain range. If it's inside the range I assign it a 1, and if it's outside the range I assign it a 0. However, I have a couple of NA values, and my loop stops the moment it reaches an NA.

Below you can see the working code if I filter out the NAs. But I would like to have a way which would handle the NAs as well.

library(dplyr) # provides %>% and filter()
df <- df %>% filter(!is.na(price))
output <- character (nrow(df)) # predefine the length and type of the vector
condition <- (df$price < df$high & df$price > df$low)   # condition check outside the loop


system.time({
  for (i in 1:nrow(df)) {
    if (condition[i]) {
      output[i] <- "1"
    }else  {
      output[i] <- "0"
    }
  }
  df$output <- output
})

Any idea how I could handle the NAs?

Newbie

3 Answers


I think you can do:

df$output <- as.integer(df$price < df$high & df$price > df$low)

which would handle all the cases.

For example,

df <- data.frame(price = c(10, 23, NA, 50), high = 25, low = 5)
df$output <- as.integer(df$price < df$high & df$price > df$low)

df
#  price high low output
#1    10   25   5      1
#2    23   25   5      1
#3    NA   25   5     NA
#4    50   25   5      0
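
If the result should stay a character vector, as in the question's loop, a minimal sketch along the same lines could use ifelse(), which passes NA through untouched. The column name output_chr and the helper variable in_range are made up for illustration; the data frame is the same example as above.

```r
# Same example data as above
df <- data.frame(price = c(10, 23, NA, 50), high = 25, low = 5)

# Vectorised range check; the comparison is NA wherever price is NA
in_range <- df$price < df$high & df$price > df$low

# ifelse() returns NA for NA tests, so no explicit NA branch is needed
df$output_chr <- ifelse(in_range, "1", "0")

print(df)
```

This avoids the loop entirely, so whether it fits depends on whether the goal is the result or practicing loops.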
Ronak Shah
  • Thanks, that's also a good approach. I was just practicing loops, that's why I wanted to do it in a loop. – Newbie Jan 24 '20 at 10:26

if/else in R doesn't accept NA as a condition: if needs a single TRUE or FALSE and throws an error when it gets NA. You could try this, where you first check whether the condition is NA, and only then check whether it is TRUE or FALSE.

output <- character (nrow(df)) # predefine the length and type of the vector
condition <- (df$price < df$high & df$price > df$low)   # condition check outside the loop

system.time({
    for (i in 1:nrow(df)) {
        if (is.na(condition[i])) {
            # test for NA first, so the next check never sees an NA
            output[i] <- NA
        } else if (condition[i]) {
            output[i] <- "1"
        } else {
            output[i] <- "0"
        }
    }
    df$output <- output
})
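
To see why the order of the checks matters, here is a small sketch on a made-up condition vector (cond and the helper label() are hypothetical names, not from the question). It captures the error that if raises on NA instead of letting it stop the script, then applies the NA-first check element by element.

```r
# Hypothetical condition vector containing one NA
cond <- c(TRUE, NA, FALSE)

# if() on an NA condition raises an error; capture its message
msg <- tryCatch(
  {
    if (cond[2]) "1" else "0"
  },
  error = function(e) conditionMessage(e)
)
print(msg)

# Checking is.na() first means if() only ever sees TRUE or FALSE
label <- function(x) {
  if (is.na(x)) {
    NA_character_
  } else if (x) {
    "1"
  } else {
    "0"
  }
}

result <- vapply(cond, label, character(1))
print(result)
```

Note that a branch such as else if (!condition[i]) does not help here, because !NA is still NA.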
Seshadri

We can also do

df$output <- +(df$price < df$high & df$price > df$low)
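
As a quick check of what the unary plus does, the sketch below compares it with as.integer() on a made-up logical vector (cond is a hypothetical name). Unary + coerces a logical vector to integer and leaves NA as NA.

```r
# Hypothetical logical vector with an NA
cond <- c(TRUE, FALSE, NA)

plus_result <- +cond            # unary plus coerces logical to integer
cast_result <- as.integer(cond) # explicit coercion

print(plus_result)
print(identical(plus_result, cast_result))
```

The two forms give the same result; as.integer() states the intent more explicitly, while + is shorter.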
akrun