I have a string variable that contains "(Null Value)" for cases with missing data. I want to recode "(Null Value)" as missing rather than have it say "(Null Value)". I'm trying to write a loop to get rid of these (Null Value) entries, and I keep getting the error `Error: unexpected '}' in "}"`.

for (row in data){
  if terminate_reason == "(Null Value)"
    recode(data$terminate_reason, "(Null Value)" = NA)
}

wtf does R think there's an extra curly bracket in there? PS-I expect I'll get other errors after solving this one because I'm new to R and have no idea what I'm doing, but I can't get past this one.

nusbaume
  • `if (terminate_reason == "(Null Value)")`. The `if` condition needs to be wrapped in parentheses. – eipi10 Oct 27 '16 at 18:25
  • But then why use a loop? You could just do `recode(data$terminate_reason, "(Null Value)" = NA)`. – eipi10 Oct 27 '16 at 18:28
  • You set up `row` as the counter but did not use it in the loop. – Pierre L Oct 27 '16 at 18:29
  • From the syntax, this may be the better route `is.na(data$terminate_reason) <- data$terminate_reason == "(Null Value)"`. Or you could take care of it when you first read your data in `read.csv(myfile, na.strings="(Null Value)")` – Pierre L Oct 27 '16 at 18:31
  • @eipi10, please post your comment as an answer - it's the correct answer to the OP's question. I'm not sure whether to close this as "typographic error" - rather, it's a "basic syntax issue", and as such it **might** actually be useful to another newbie (although the universe of syntax errors that could lead to this error is large ...) – Ben Bolker Oct 27 '16 at 18:34
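The two alternatives Pierre L mentions can be sketched as follows. This is only an illustration: the CSV text, the `id` column, and the object names `from_file` and `already_loaded` are made up, and `read.csv(text = ...)` stands in for reading a real file.

```r
# Hypothetical CSV contents standing in for the asker's file
csv_text <- "id,terminate_reason\n1,Quit\n2,(Null Value)\n3,Laid off\n"

# Option 1: treat "(Null Value)" as NA while reading the data in
from_file <- read.csv(text = csv_text,
                      na.strings = "(Null Value)",
                      stringsAsFactors = FALSE)

# Option 2: the data is already loaded, so mark the matching entries as NA
already_loaded <- read.csv(text = csv_text, stringsAsFactors = FALSE)
is.na(already_loaded$terminate_reason) <-
  already_loaded$terminate_reason == "(Null Value)"

# Both routes should leave the second row missing
from_file$terminate_reason
already_loaded$terminate_reason
```

Note that passing `na.strings` replaces the default of `"NA"`, so if the file also uses literal `NA` for missing values, something like `na.strings = c("NA", "(Null Value)")` may be needed.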

1 Answer

Your code has a number of issues:

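  1. The body of the if statement should be enclosed in {} symbols (R allows them to be omitted for a single expression, but they make the structure explicit).
  2. The if also needs enclosing () symbols around the logical expression.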
  3. The result of the recode function is not stored anywhere (the function does not change values-in-place).
  4. Loops are a poor solution to this problem.
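For reference, a version of the loop with the first three issues fixed might look like the sketch below. The toy `data` frame is made up for illustration, and the loop runs over row indices, since iterating over a data frame directly walks its columns rather than its rows.

```r
# Hypothetical toy data standing in for the asker's `data`
data <- data.frame(
  id = 1:4,
  terminate_reason = c("Quit", "(Null Value)", "Laid off", "(Null Value)"),
  stringsAsFactors = FALSE
)

# Loop over row indices rather than over `data` itself
for (i in seq_len(nrow(data))) {
  value <- data$terminate_reason[i]
  # Condition wrapped in (), body wrapped in {};
  # the is.na() guard avoids an error if a value is already missing
  if (!is.na(value) && value == "(Null Value)") {
    # Assign the result back, since nothing is changed in place otherwise
    data$terminate_reason[i] <- NA
  }
}

data$terminate_reason
```

This works, but it is still slower and more verbose than a vectorized replacement.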

It would be much simpler to take advantage of R's natural vectorization. Rather than an if inside a for loop, you can do this all in one line:

data$terminate_reason[data$terminate_reason == '(Null Value)'] <- NA
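Here is a small self-contained run of that line, assuming a made-up three-row data frame, to show what the logical index selects and what the column looks like afterwards:

```r
# Hypothetical toy data; only the column name comes from the question
data <- data.frame(
  terminate_reason = c("Quit", "(Null Value)", "Laid off"),
  stringsAsFactors = FALSE
)

# The comparison yields one TRUE/FALSE per row
is_null_value <- data$terminate_reason == "(Null Value)"

# Rows where the comparison is TRUE are overwritten with NA
data$terminate_reason[is_null_value] <- NA

# Check which entries are now missing
is.na(data$terminate_reason)
```

The intermediate `is_null_value` variable is only there for readability; the one-liner does the same thing.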

That should do the trick, but ensure that the "terminate_reason" column is character, not factor.

jdobres
  • Actually, testing equality in this way should work OK for a factor ... factors will be coerced to character for comparison with a character. – Ben Bolker Oct 27 '16 at 18:40
  • Actually you're right. My worry was that, if you try to assign some new value into a factor column that wasn't part of the original factor scheme, it'll fail. But since the goal here is to replace with NA, it shouldn't matter. – jdobres Oct 27 '16 at 19:05
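The factor behaviour discussed in these two comments can be checked with a quick sketch. The vectors `reason` and `other` are made up for illustration. As far as I know, assigning an unknown level in base R produces a warning and an NA rather than a hard error, so the warning is suppressed here to keep the example quiet.

```r
# Hypothetical factor version of the column
reason <- factor(c("Quit", "(Null Value)", "Laid off"))

# Comparing a factor with a string works, and assigning NA is allowed
reason[reason == "(Null Value)"] <- NA
is.na(reason)

# "(Null Value)" is still listed as a level until unused levels are dropped
reason <- droplevels(reason)
levels(reason)

# Assigning a value that is not an existing level does not store it:
# base R warns about an invalid factor level and inserts NA instead
other <- factor(c("a", "b"))
suppressWarnings(other[1] <- "c")
is.na(other[1])
```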