I am trying to write a simple if statement. I have data like this:

Number  Description        Type
1       Snow in road       "weather"
2       Ice on Roof        "Weather"
3       Dog bite           "Not Classified"

Basically trying to do this:

if(data$type == "Not Classified") {sapply(data$description, colcheck)} else "Not Classified"

The desired result would be for the function that I have stated previously in my code to run on row 3, the "Not Classified" row. For some reason, I keep getting the same error:

"the condition has length > 1 and only the first element will be used".

colcheck is a function I created earlier in my code. I have tried ifelse, removing the else at the end, and adding a do in front of the function, but none of these work. I am trying to filter the data so that the function runs only on the rows where type == "Not Classified". Thank you

Reagan
  • Maybe: `data$newColumn <- ifelse(data$type == "Not Classified", colcheck(data$description), "Not Classified")` – zx8754 Jul 24 '18 at 16:17
  • Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269). This will make it much easier for others to help you. – zx8754 Jul 24 '18 at 16:17

2 Answers

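The problem is that data$type is a vector of length > 1. The == operator is vectorized, so data$type == "Not Classified" returns one TRUE or FALSE per row. An if statement, however, expects a single TRUE or FALSE. When you pass it a longer vector, it uses only the first element and warns rather than failing (from R 4.2.0 onward, this is an error).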
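To see where the message comes from, it can help to look at the comparison by itself. The sketch below uses a stand-in vector `type` with illustrative values (not the asker's real data). It shows that the comparison yields one logical value per element, which `if` cannot use as a single condition, while `ifelse` handles it element by element.

```r
# Stand-in for data$type (illustrative values only)
type <- c("weather", "Weather", "Not Classified")

# The comparison is vectorized: one logical value per element
is_unclassified <- type == "Not Classified"
print(is_unclassified)

# if() needs a single TRUE/FALSE, so passing it is_unclassified triggers
# the "condition has length > 1" message (an error from R 4.2.0 onward).
# ifelse() works element by element instead:
result <- ifelse(is_unclassified, "needs checking", "already classified")
print(result)
```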

What you want to do is use apply or dplyr::mutate to apply the test to each element of data$type:

data <- data.frame('Number' = c(1,2,3),
                   'Description' = c('Snow in road', 'Ice on Roof', 'Dog bite'),
                   'Type' = c('weather', 'Weather', 'Not Classified'))

  Number  Description           Type
1      1 Snow in road        weather
2      2  Ice on Roof        Weather
3      3     Dog bite Not Classified

Example function for colcheck:

colcheck <- function(x) return(paste0('x',x,'x'))

Using base apply:

apply(data, 1, function(row) {
    if (row['Type'] == 'Not Classified') {
        return(colcheck(row['Description']))
    } else {
        return(row['Description'])
    }
})

[1] "Snow in road" "Ice on Roof"  "xDog bitex"  

Or with dplyr:

library(dplyr)
data %>%
    mutate('colcheck' = if_else(Type == 'Not Classified',
                                colcheck(Description),
                                paste(Description)))

  Number  Description           Type     colcheck
1      1 Snow in road        weather Snow in road
2      2  Ice on Roof        Weather  Ice on Roof
3      3     Dog bite Not Classified   xDog bitex
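Since the goal was to run the function only on the "Not Classified" rows, another option is plain logical indexing in base R. This is a minimal sketch, not part of the original answer: it rebuilds the example data so it runs on its own, and it assumes `colcheck` accepts a character vector (the placeholder version from the example does; the asker's real function may not).

```r
# Same example data as above, rebuilt so the snippet runs on its own
data <- data.frame(Number = c(1, 2, 3),
                   Description = c("Snow in road", "Ice on Roof", "Dog bite"),
                   Type = c("weather", "Weather", "Not Classified"),
                   stringsAsFactors = FALSE)

# Placeholder for the asker's real colcheck function (assumed vectorized)
colcheck <- function(x) paste0("x", x, "x")

# Logical vector marking the rows to process
unclassified <- data$Type == "Not Classified"

# Start from the unchanged descriptions, then overwrite only the marked rows
data$colcheck <- data$Description
data$colcheck[unclassified] <- colcheck(data$Description[unclassified])

print(data)
```

If the real `colcheck` only handles one value at a time, wrapping the call as `sapply(data$Description[unclassified], colcheck, USE.NAMES = FALSE)` should give the same shape of result.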
divibisan
  • I used the apply version and it runs, but it is working on everything. It is changing all the already classified rows to "Not Classified" if they are not "Not Classified". – Reagan Jul 24 '18 at 16:59
  • What do you want it to do? In the example you gave, if `type == 'Not Classified'` then it runs `colcheck` and returns the result. Otherwise, it returns the string `'Not Classified'`. That's what this does. What do you want it to do if `type != 'Not Classified'`? – divibisan Jul 24 '18 at 17:26
  • I want it to do nothing to those rows. Sorry, I want it to return "Not Classified" if the function returns false. I tried to make the return "data$colcheck" (per your example), but it is returning something odd. – Reagan Jul 24 '18 at 17:36
  • Then just change what it returns if `data$type == 'Not Classified'` is `FALSE`. I've changed it above so in that case, it just returns the value of the `description` column unchanged. – divibisan Jul 24 '18 at 17:40
  • I was referring to the data frame you created above with the colcheck column. I am putting in the dplyr function and I am still getting the same error, "the condition has length > 1 and only the first element will be used" – Reagan Jul 24 '18 at 18:03
  • I tried it again with the exact data you used in your example and it seems to work fine. Try restarting R (unexpected objects lurking in the global environment are a common cause of R problems), make sure you have `dplyr` installed and loaded, and make sure capitalization matches (R is case sensitive) – divibisan Jul 24 '18 at 18:10
  • @Reagan Did you ever solve your problem? If you found a different solution, you can post that as your own answer. – divibisan Jul 30 '18 at 19:16

Instead of using functions, I just did one large ifelse statement for each classification. For example,

data$Type <- ifelse(grepl('Snow| Ice | Rain', data$description, ignore.case = TRUE), "Weather", "Not Classified")

And another for Nonweather, creating a new column of their results.

data$Type1 <- ifelse(grepl('Dog Bite| Stinky House| Mice Infestation', data$description, ignore.case = TRUE), 'Non-Weather', "Not Classified")

And then, since the non-weather classification depends on the weather classification, I wrote this simple ifelse statement to keep the weather result where there was one and otherwise fall back to the non-weather result, all in a new column adjacent to the others.

data$Type2<-ifelse(data$Type == "Weather", 
"Weather",data$Type1)

And then I deleted the "Type1" and "Type" columns and renamed "Type2" to just "Type" to keep the results I wanted.

It may be lengthy or sloppy, but it did the trick!
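For comparison, the same priority order (weather first, then non-weather, then "Not Classified") can be written as one nested `ifelse`, which avoids the temporary columns. This is only a sketch under assumptions: the data frame and its rows are made up for illustration, and the keyword patterns use `\\b` word boundaries with `perl = TRUE` instead of the spaces in the patterns above, so that "Ice" does not match inside "Mice".

```r
# Illustrative data only; the last two rows are hypothetical additions
data <- data.frame(Description = c("Snow in road", "Ice on Roof", "Dog bite",
                                   "Mice infestation", "Flat tire"),
                   stringsAsFactors = FALSE)

# Keyword patterns; \\b marks a word boundary (an assumption about the intent
# behind the spaces in the original patterns)
weather_words     <- "\\b(Snow|Ice|Rain)\\b"
non_weather_words <- "\\b(Dog Bite|Stinky House|Mice Infestation)\\b"

is_weather     <- grepl(weather_words, data$Description,
                        ignore.case = TRUE, perl = TRUE)
is_non_weather <- grepl(non_weather_words, data$Description,
                        ignore.case = TRUE, perl = TRUE)

# Weather takes priority; non-weather applies only to the remaining rows
data$Type <- ifelse(is_weather, "Weather",
                    ifelse(is_non_weather, "Non-Weather", "Not Classified"))

print(data)
```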

Reagan