I have a dataframe in which I want to create a new column with 0/1 (which would represent absence/presence of a species) based on the records in previous columns. I've been trying this:

update_cat$bobpresent <- NA #creating the new column

x <- c("update_cat$bob1999", "update_cat$bob2000", "update_cat$bob2001","update_cat$bob2002", "update_cat$bob2003", "update_cat$bob2004", "update_cat$bob2005", "update_cat$bob2006","update_cat$bob2007", "update_cat$bob2008", "update_cat$bob2009") #these are the names of the columns I want the new column to base its results in

bobpresent <- function(x){
  if(x==NA)
    return(0)
  else
    return(1)
} # if all the previous columns are NA then the new column should be 0, otherwise it should be 1

update_cat$bobpresence <- sapply(update_cat$bobpresent, bobpresent) #apply the function to the new column

Everything works fine until the last line, where I get this error:

Error in if (x == NA) return(0) else return(1) : 
  missing value where TRUE/FALSE needed

Can somebody please advise me? Your help will be much appreciated.

Fred Foo
Cat

1 Answer

Comparisons involving NA yield NA, so x == NA always evaluates to NA, whatever x is. To check whether a value is NA, use the is.na function, for example:

> NA == NA
[1] NA
> is.na(NA)
[1] TRUE

The if statement inside your function requires a condition that is either TRUE or FALSE, but it gets NA instead, hence the error message. You can fix that by rewriting your function like this:

bobpresent <- function(x) { ifelse(is.na(x), 0, 1) }
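
As a quick check of the rewritten function, here is a minimal sketch on a made-up vector (the name counts and its values are hypothetical, not taken from your data). Because ifelse and is.na are both vectorised, the function can be called on a whole column without sapply:

```r
# Rewritten function from above: 0 where the value is NA, 1 otherwise
bobpresent <- function(x) { ifelse(is.na(x), 0, 1) }

# Hypothetical capture counts for one year; NA means no record
counts <- c(NA, 2, NA, 1)

# ifelse() and is.na() are vectorised, so no sapply() is needed
result <- bobpresent(counts)
print(result)
# [1] 0 1 0 1
```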

In any case, based on your original post I don't understand what you're trying to do. This change only fixes the error you get with sapply, but fixing the logic of your program is a different matter, and there is not enough information in your post.
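
If the goal is what the comment in your code describes (0 when every yearly column is NA for a row, 1 otherwise), one possible approach is to count the non-NA values per row. This is only a sketch under that assumption: the toy data frame is invented for illustration, year_cols and n_records are names I made up, and only three of the bob1999 to bob2009 columns are shown.

```r
# Toy stand-in for update_cat; the values are invented for illustration
update_cat <- data.frame(
  bob1999 = c(NA, 1, NA),
  bob2000 = c(NA, NA, NA),
  bob2001 = c(NA, 2, 3)
)

# Column names as plain strings, without the "update_cat$" prefix
year_cols <- c("bob1999", "bob2000", "bob2001")

# Count non-NA entries per row; any count above zero means "present"
n_records <- rowSums(!is.na(update_cat[, year_cols]))
update_cat$bobpresent <- as.integer(n_records > 0)

print(update_cat$bobpresent)
# [1] 0 1 1
```

Note that the sapply call in the question runs over update_cat$bobpresent, which was just filled with NA, so with the is.na fix every element maps to 0 regardless of what the yearly columns contain.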

janos
  • Thank you both for your suggestions. Neither of them works, though. It converts everything to 0 without distinguishing the columns that contain other data (like 1, 2; these are numbers of animals captured). Any idea what's happening? – Cat Apr 25 '13 at 19:29
  • You need to give us a small sample of your data so we know what sort of values (and their class) are in each column. Also, your function doesn't refer to the outside variable `x` but rather to the variable you've named in your `sapply` call. My guess is you want to do something else. Can you write out pseudocode in some loop so we know what should be happening? – Carl Witthoft Apr 25 '13 at 20:05