
I have a dataset with lots of numerical variables, and a character variable that says whether or not low values are suppressed for that observation. In observations where values aren't suppressed, I want to replace NAs with 0s (just for specific variables), and I can't figure it out. This is my data:

suppressed var1 var2
      none    2    6
      none   NA    6
      none    3    7
      none   NA   NA
      full    2    6
      full    3    6
      none    3   NA
   partial   NA    6
      none    2    7
      none   NA   NA

What I want to do is change NA to 0 in var1 if suppressed is 'none'. I tried

df$Var1<-if (df$suppressed=='none'&is.na(df$Var1)) 0 
         else df$Var1

and got

Error in if (df$suppressed == "none" & is.na(df$Var1)) 0 else df$Var1 : 
  argument is of length zero

Is there something wrong with my if else statement, or is there another way to do this?

Here's the structure of my data:

structure(list(suppressed = structure(c(2L, 2L, 2L, 2L, 1L, 1L, 2L, 3L, 2L, 2L), .Label = c("full", "none", "partial"), class = "factor"), var1 = c(2, NA, 3, NA, 2, 3, 3, NA, 2, NA), var2 = c(6, 6, 7, NA, 6, 6, NA, 6, 7, NA)), .Names = c("suppressed", "var1", "var2"), row.names = c(NA, -10L), class = "data.frame")
David Arenburg
user5457414

1 Answer

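ifelse() can do the trick. It takes three arguments: the condition, the value to use where the condition is TRUE, and the value to use where it is FALSE. Note that the column in your data is named var1, not Var1.

df$var1 <- ifelse(df$suppressed == 'none' & is.na(df$var1), 0, df$var1)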
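Since you mention wanting this only for specific variables, here is a minimal sketch that applies the same ifelse() call to a chosen set of columns. It assumes the data frame is rebuilt from the values in your question with the lowercase column names shown in its structure, and cols is a hypothetical name for the columns you want to fill.

```r
# Rebuild the sample data from the question (lowercase column names).
df <- data.frame(
  suppressed = factor(c("none", "none", "none", "none", "full",
                        "full", "none", "partial", "none", "none")),
  var1 = c(2, NA, 3, NA, 2, 3, 3, NA, 2, NA),
  var2 = c(6, 6, 7, NA, 6, 6, NA, 6, 7, NA)
)

# Hypothetical: the columns in which NA should become 0.
cols <- c("var1", "var2")

for (col in cols) {
  # Replace NA with 0 only in rows where nothing is suppressed.
  df[[col]] <- ifelse(df$suppressed == "none" & is.na(df[[col]]), 0, df[[col]])
}
```

The NA in var1 on the 'partial' row is left alone, because the condition also requires suppressed to be 'none'.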

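Curly brackets {} are not the real problem in your code. if() expects a single TRUE or FALSE, not a whole vector, and df$Var1 does not exist (the column is var1), so the condition has length zero. If you want to keep if(), test one row at a time:

for (i in seq_len(nrow(df))) {
    # && is fine here because each side is a single value
    if (df$suppressed[i] == 'none' && is.na(df$var1[i])) {
        df$var1[i] <- 0
    }
}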

Hope this helps.

Raphael K
  • There is absolutely no need for `ifelse` when you only have one condition to fulfill. Why would you need to put `Var1` into the *else* part if it is already there? – David Arenburg May 03 '16 at 19:40
  • I totally agree, but seeing as how he was trying to use an if-statement, I wanted to show him ifelse(). But, I can't argue with you. – Raphael K May 03 '16 at 19:41
  • Also, `df$var1 == NA` isn't doing what you think it is doing. So your second option won't work. – David Arenburg May 03 '16 at 19:42
  • Why doesn't it work? It tests a logical condition, no? – Raphael K May 03 '16 at 19:44
  • No. R can't tell whether something equals `NA`, because it doesn't know what `NA` is equal to. This is why we have the `is.na` function. Anything compared against `NA` returns `NA`. Try `c(1, NA, "h") == NA`, for instance, or even just `NA == NA`. It is always better to test your code. – David Arenburg May 03 '16 at 19:46
  • Thank you! Learned something today. – Raphael K May 03 '16 at 19:48
  • Though I agree that your answer is useful for more general cases (+1). But here is something else you might want to be aware of: http://stackoverflow.com/questions/16275149/does-ifelse-really-calculate-both-of-its-vectors-every-time-is-it-slow – David Arenburg May 03 '16 at 19:51
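
To make the points from the comments concrete, here is a small sketch of the approach without `ifelse`, together with the `NA` comparison behaviour. It assumes the data frame is rebuilt from the values in the question with lowercase column names, and fill is a hypothetical name for the logical index.

```r
# Rebuild the sample data from the question (lowercase column names).
df <- data.frame(
  suppressed = factor(c("none", "none", "none", "none", "full",
                        "full", "none", "partial", "none", "none")),
  var1 = c(2, NA, 3, NA, 2, 3, 3, NA, 2, NA),
  var2 = c(6, 6, 7, NA, 6, 6, NA, 6, 7, NA)
)

# Comparing with == against NA yields NA; is.na() is the reliable test.
na_compare <- NA == NA      # NA, not TRUE
na_test    <- is.na(NA)     # TRUE

# Logical index of rows to fill: not suppressed and currently missing.
fill <- df$suppressed == "none" & is.na(df$var1)

# Assign only to the selected rows; all other values are left as they are.
df$var1[fill] <- 0
```

Because only the selected elements are assigned, there is no else branch to write.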