I need to produce some new factor variables in my dataset which contain information from existing factor variables.
In the first case I need to produce a binary NewVariable based on whether certain values occur in a specific variable that has more than 100 levels. I use revalue() from the plyr package, namely:
NewVar <- if(OldVar1=="helen" | OldVar1=="greg")
{NewVar <-revalue(OldVar1, c("helen"="participant", "greg"="participant"))}
else {NewVar=="nonparticipant"}
What I actually want is to collapse specific levels of the old variable into a single level of the new variable. As you can imagine, the above code does not work, but I cannot figure out why.
In the second case I need to combine information from three existing factor variables (OldVar1, OldVar2, OldVar3) to fill in the levels of a multi-category NewVariable. I run this code:
NewVariable="OptionA" <- if(OldVar1=="a" & OldVar2=="b" & OldVar3=="c")
I get the error "Error: unexpected '=' in "OldVar="". The same occurs when I remove one of the = signs in OldVar1=="a".
Is it possible to create a factor NewVariable with its levels and labels without filling it with the string values in advance? I was not able to find anything on that; the tutorials I have seen generate their own data and only have to label the existing values.
Also, I would like to assign values to the rest of my cases, which belong to OptionA, OptionB, OptionC, etc. Would this be possible by writing a separate if-statement for each of them, as in the following?
NewVariable="OptionA" <- if(OldVar1=="a" & OldVar2=="b" & OldVar3=="c")
NewVariable="OptionB" <- if(OldVar1=="a" & OldVar2=="d" & OldVar3=="e")
=== EDIT ===
For the second "challenge" I followed the code suggested by DWin. I produced an interaction of the three variables that appear in the if(...) above and put inside c() only the values that I needed, for example:
OldVar.ALL.interactions <- with(data, interaction(OldVar1, OldVar2, OldVar3))
levels(OldVar.ALL.interactions) # search for the levels that we need to include
# in the NewVar
# below I follow DWin's code
NewVar <- factor(rep(NA, length(AnotherVarOfTheDataset) ),
levels=c("OptionA", "OptionB", ...))
NewVar[OldVar.ALL.interactions %in% c("...interaction.of.Old.Variables...")] <- "OptionA"
# the same as in OptionA for the rest of the levels
# the ** NewVar[ is.na(NewVar) ] <- "nonparticipant" ** of DWin's code is not needed
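For concreteness, here is a minimal sketch of the approach above on made-up toy data. The data frame, its values, and the two interaction strings are assumptions for illustration only, not my real dataset; the variable names follow the ones used above.

```r
# Toy data: these values are invented purely for illustration
data <- data.frame(
  OldVar1 = factor(c("a", "a", "x")),
  OldVar2 = factor(c("b", "d", "y")),
  OldVar3 = factor(c("c", "e", "z"))
)

# interaction() joins the level labels with "." by default, e.g. "a.b.c"
OldVar.ALL.interactions <- with(data, interaction(OldVar1, OldVar2, OldVar3))

# Start from an all-NA factor that already carries the desired levels
NewVar <- factor(rep(NA, nrow(data)), levels = c("OptionA", "OptionB"))

# Fill in each level only where the matching combination occurs
NewVar[OldVar.ALL.interactions %in% c("a.b.c")] <- "OptionA"
NewVar[OldVar.ALL.interactions %in% c("a.d.e")] <- "OptionB"

print(NewVar)
```

Rows whose combination is not listed in any of the c() vectors (the third row here) simply stay NA.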
Is there any other way to solve this without using the interaction between the old factor variables?