
Here is a minimal reproducible example that generates the error:

comb3 <- function(x) {
  if (x == "Unable to do") {
    x = 0
  }
}

Here is my original function:

comb <- function(x) {
  if (x == "Unable to do") {
    x = 0
  } else if (x == "Very difficult to do") {
    x = 1
  } else if (x == "Somewhat difficult to do") {
    x = 2
  } else if (x == "Not difficult") {
    x = 3
  }
}

I am trying to apply this function to a column (a sample is shown below), but I get these warnings:

Warning messages:
1: In if (x == "Unable to do") { :
  the condition has length > 1 and only the first element will be used
2: In if (x == "Very difficult to do") { :
  the condition has length > 1 and only the first element will be used

Here is a sample of what the data in one column looks like:
sample <- c("Unable to do", "Somewhat difficult to do", "Very difficult to do", "Unable to do", "Not difficult","Unable to do","Very difficult to do", "Not difficult", "Unable to do", "Not difficult")        
Krishna
  • So, you want to modify a column based on its name? – slava-kohut Oct 22 '19 at 15:25
  • It looks like you want `case_when` instead if you are using `mutate_at`. But you should show your `dplyr` code as well and include some sample data to make a proper [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) – MrFlick Oct 22 '19 at 15:26
  • I am getting the error when performing function(column) on a single column without using the dplyr function that is why I didn't include it. I haven't attempted to use this function with dplyr yet. @MrFlick – Krishna Oct 22 '19 at 15:48
  • @slava-kohut Actually the sample I provided is one column. Let me clarify that. – Krishna Oct 22 '19 at 15:49
  • @Krishna `if()` is not a vectorized control flow statement. If I have `x<-c(3,10)` and I do `if(x>5) print("ok")`, this will not work because there are two values of `x`, one less than 5 and one greater. A single `if` statement cannot handle that case. You need `ifelse()` or `if_else` or `case_when` since functions operate on all rows of a data.frame at a time. – MrFlick Oct 22 '19 at 15:51
  • @MrFlick Ok that makes a lot of sense. I will investigate those three options. Thank you so much for your guidance! – Krishna Oct 22 '19 at 16:00
  • @MrFlick Using case_when in my function worked perfectly! Thanks again. – Krishna Oct 22 '19 at 16:06

1 Answer


The warning message describes the problem fairly well: if expects a single logical value (a length-one logical vector) as its condition, but comparing a whole column to a string produces one value per element. Since R 4.2.0 this is an error rather than a warning. To apply a conditional across a vector, use a vectorized function such as ifelse or, as MrFlick said, if_else or case_when from dplyr.
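
To make the length mismatch concrete, here is a minimal sketch. The three values are taken from the question's sample; the names x, cond and scores are only illustrative.

```r
# Three values borrowed from the question's sample column
x <- c("Unable to do", "Not difficult", "Very difficult to do")

# The comparison is vectorized: it returns one TRUE/FALSE per element
cond <- x == "Unable to do"
length(cond)  # 3, so a single `if (cond)` has no one value to branch on

# ifelse() evaluates the condition element by element instead
scores <- ifelse(cond, 0, NA)
scores
```

Only the first element matches here, so scores has 0 in the first position and NA elsewhere.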

An equivalent version of your function using ifelse would be something like this:

comb1 <- function(x) {
  ifelse(x == "Unable to do", 
    0,
    ifelse (x == "Very difficult to do",
      1,
      ifelse(x == "Somewhat difficult to do",
        2,
        ifelse(x == "Not difficult",
          3,
          ## If not match then NA
          NA
        )
      )
    )
  )
}

Note that this is hard to read because the ifelse calls are nested inside one another. One way to avoid the nesting is to wrap a slightly modified version of your function in a call to sapply, which applies it to each element in turn:

comb2 <- function(x) {
  ## USE.NAMES = FALSE means that the output is not named, and has no other effect
  sapply(x, function(y) {
    if (y == "Unable to do") {
      0
    } else if (y == "Very difficult to do") {
      1
    } else if (y == "Somewhat difficult to do") {
      2
    } else if (y == "Not difficult") {
      3
    } else {
      ## If no match then NA (otherwise sapply would return a list containing NULL)
      NA
    }
  }, USE.NAMES = FALSE)
}

You could also use factors, which are internally coded as integers starting at 1, and (ab)use this to convert from strings to numbers:

comb3 <- function(x) {
  fac <- factor(x, 
    levels = c(
      "Unable to do",
      "Very difficult to do",
      "Somewhat difficult to do",
      "Not difficult"
    )
  )
  as.numeric(fac) - 1
}
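
As a rough illustration of what the factor approach does internally, the sketch below shows the integer codes directly. The value "Not applicable" is a made-up response that is not among the levels, included only to show that unmatched values become NA; the names responses, lev, fac and codes are illustrative.

```r
# "Not applicable" is a hypothetical value that is not one of the levels
responses <- c("Not difficult", "Unable to do", "Not applicable")

lev <- c(
  "Unable to do",
  "Very difficult to do",
  "Somewhat difficult to do",
  "Not difficult"
)

fac <- factor(responses, levels = lev)

# Codes are the 1-based positions in `lev`; unmatched values are NA
codes <- as.integer(fac)
codes       # 4 1 NA
codes - 1L  # 3 0 NA
```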

All three versions return the same output for this sample, which is a good example of how many ways there are to accomplish the same thing in R. This can sometimes be a curse rather than a gift.

sample <- c("Unable to do", "Somewhat difficult to do", "Very difficult to do", "Unable to do", "Not difficult","Unable to do","Very difficult to do", "Not difficult", "Unable to do", "Not difficult")
comb1(sample)
# [1] 0 2 1 0 3 0 1 3 0 3
comb2(sample)
# [1] 0 2 1 0 3 0 1 3 0 3
comb3(sample)
# [1] 0 2 1 0 3 0 1 3 0 3
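
The comments on the question mention case_when, which the asker reported worked. Their actual code was not posted, so the following is only a sketch of what such a version might look like, assuming the dplyr package is installed. The name comb4 is hypothetical.

```r
# Sketch only: assumes the dplyr package is installed.
# comb4 is a hypothetical name, not part of the original answer.
comb4 <- function(x) {
  dplyr::case_when(
    x == "Unable to do"             ~ 0,
    x == "Very difficult to do"     ~ 1,
    x == "Somewhat difficult to do" ~ 2,
    x == "Not difficult"            ~ 3,
    # Anything else (including NA) falls through to NA
    TRUE                            ~ NA_real_
  )
}

responses <- c("Unable to do", "Somewhat difficult to do", "Not difficult")
comb4(responses)
```

The conditions are checked in order for each element, so this reads much like the original if/else chain while remaining vectorized.
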
alan ocallaghan
  • Thank you for the detailed answer aocall and demonstrating the results of each option. I will look into their implementation! – Krishna Nov 05 '19 at 16:00