Similar to another question asked on here, I get the error message in the subject line. I attempted to use its remedy to fix my problem, but I was not able to do so. Here is my code:

#Change the format of IED deaths to be uniform
USdata$Cause[USdata$Cause=="Hostile - hostile fire - IED attack" | USdata$Cause=="Hostile - hostile fire - IED attack (suicide attack)" | USdata$Cause=="Hostile - hostile fire - IED attack (suicide car bomb)" | USdata$Cause=="Hostile - hostile fire - IED attack (while defusing)" | USdata$Cause=="Hostile - hostile fire - IED attack, RPG" | USdata$Cause=="Hostile - hostile fire - IED attack, RPG, small arms fire" | USdata$Cause=="Hostile - hostile fire - IED Attack, small arms fire" | USdata$Cause=="Hostile - hostile fire - IED Attack, small arms fire, indirect fire"] <- "Hostile - IED Attack"

Warning message:
In `[<-.factor`(`*tmp*`, USdata$Cause == "Hostile - hostile fire - IED attack" |  :
 invalid factor level, NA generated

When I run a summary on the result, every value I tried to set to "Hostile - IED Attack" comes back as NA. I was able to do something similar with other values, but this one isn't working so easily. Thanks.

hrbrmstr
Sean Sharp
  • 4
    When you read the data using `read.table` or `read.csv`, use `stringsAsFactors=FALSE` – akrun Sep 08 '14 at 17:22
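
A minimal sketch of that suggestion, using an inline `text =` string as a stand-in for the real file. The sample rows and the `Name` column are made up for illustration; only `USdata` and `Cause` come from the question.

```r
# Hypothetical two-row sample standing in for the real data file
csv_text <- 'Name,Cause
A,"Hostile - hostile fire - IED attack"
B,"Hostile - hostile fire - IED attack, RPG"'

# stringsAsFactors = FALSE keeps Cause as a character vector, not a factor
USdata <- read.csv(text = csv_text, stringsAsFactors = FALSE)

class(USdata$Cause)  # "character"

# Plain character assignment works without any factor-level warning
USdata$Cause[USdata$Cause == "Hostile - hostile fire - IED attack"] <- "Hostile - IED Attack"
```

Since R 4.0.0, `stringsAsFactors = FALSE` is already the default for `read.csv` and `read.table`, so the argument only matters on older versions.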

1 Answer

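The column is a factor, and "Hostile - IED Attack" is not one of its existing levels, so the assignment produces NA. Convert the column to character first, make the change, then convert it back to a factor. Also, `%in%` might work better for you in the long run: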

ied_causes <- c("Hostile - hostile fire - IED attack",
                "Hostile - hostile fire - IED attack (suicide attack)",
                "Hostile - hostile fire - IED attack (suicide car bomb)",
                "Hostile - hostile fire - IED attack (while defusing)",
                "Hostile - hostile fire - IED attack, RPG",
                "Hostile - hostile fire - IED attack, RPG, small arms fire",
                "Hostile - hostile fire - IED Attack, small arms fire",
                "Hostile - hostile fire - IED Attack, small arms fire, indirect fire")

USdata$Cause <- as.character(USdata$Cause)
USdata$Cause[USdata$Cause %in% ied_causes] <- "Hostile - IED Attack"
USdata$Cause <- factor(USdata$Cause)
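
As a hedged aside, the warning can be reproduced on a small toy factor, and the same recoding can also be done by renaming the levels directly instead of round-tripping through character. The vector `cause` and its values below are illustrative stand-ins, not the poster's actual data.

```r
# Toy factor standing in for USdata$Cause (values are illustrative)
cause <- factor(c("Hostile - hostile fire - IED attack",
                  "Hostile - hostile fire - IED attack, RPG",
                  "Non-hostile - accident"))

# Assigning a string that is not an existing level yields NA plus the
# "invalid factor level, NA generated" warning
broken <- cause
broken[1] <- "Hostile - IED Attack"
is.na(broken[1])  # TRUE

# Alternative: rename the matching levels; levels given the same name are merged
ied_causes <- c("Hostile - hostile fire - IED attack",
                "Hostile - hostile fire - IED attack, RPG")
levels(cause)[levels(cause) %in% ied_causes] <- "Hostile - IED Attack"
levels(cause)
```

Renaming levels leaves the column as a factor throughout, which may be convenient if other code depends on that; the character round trip above is arguably easier to read.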
hrbrmstr
  • 1
    You don't need to type out every secondary reason. Might be faster to grep main COD. `USdata$Cause[grep("IED attack", USdata$Cause, ignore.case=TRUE)] <- "Hostile - IED Attack"` – Brandon Bertelsen Sep 08 '14 at 17:32
  • 1
    Without knowing the underlying data, I didn't want to make any assumptions about what the poster has to work with, but you're absolutely spot on if it happens to be uniform like that. Much better approach. – hrbrmstr Sep 08 '14 at 17:34
  • 1
    It wasn't a criticism :) comment is directed at OP. – Brandon Bertelsen Sep 08 '14 at 17:35
  • Aye. I grok'd that (and modified my comment as such :-) – hrbrmstr Sep 08 '14 at 17:35
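
The pattern-matching idea from the comments can be sketched as follows, assuming every IED-related cause contains the text "IED attack" in some capitalization. The toy vector `cause` is a stand-in for `USdata$Cause` after it has been converted with `as.character()`.

```r
# Toy character vector standing in for USdata$Cause after as.character()
cause <- c("Hostile - hostile fire - IED attack (while defusing)",
           "Hostile - hostile fire - IED Attack, small arms fire",
           "Hostile - hostile fire - small arms fire")

# ignore.case = TRUE catches both "IED attack" and "IED Attack"
is_ied <- grepl("IED attack", cause, ignore.case = TRUE)
cause[is_ied] <- "Hostile - IED Attack"

# Convert back to a factor once the recoding is done
cause <- factor(cause)
table(cause)
```

Whether a single pattern is safe depends on the real data; if some unrelated cause also mentions "IED attack", the explicit list with `%in%` is the more conservative choice.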