I am having trouble creating a dummy variable in a dataset based on conditions on other variables. My question is probably less about code and more about logic, so the solution may simply be a matter of getting the logical operators right, which is exactly where I keep failing.
I will copy my actual code here so that I don't inadvertently remove my mistake in creating a generic sample.
"eb" is a dataset containing a Eurobarometer survey, and I am interested in three of its variables. I want to assign a value of 1 to respondents with a specific response pattern, and 0 to all respondents not exhibiting this pattern. A missing value in one or more of the relevant variables should also result in a 0. The following code works:
eb$dummy1 <- ifelse(eb$v246 == 1 & !is.na(eb$v246) & eb$v247 == 1 & !is.na(eb$v247)
& eb$v250 == 4 & !is.na(eb$v250), "1", "0")
table(is.na(eb$dummy1))
FALSE
12995
But if I relax the selection criteria so that not only the strong agrees/disagrees count, but also the slight agrees/disagrees, it no longer works: the result now contains missing values.
eb$dummy2 <- ifelse (!is.na(eb$v246) & eb$v246 == 1 | eb$v246 == 2
& !is.na(eb$v247) & eb$v247 == 1 | eb$v247 == 2
& !is.na(eb$v250) & eb$v250 == 4 | eb$v250 == 3, "1", "0")
table(is.na(eb$dummy2))
FALSE TRUE
11769 1226
What's especially strange is that as I change the order of the conditions inside ifelse, the number of NAs fluctuates (between 1000 and 1400). That's why I suspect that I am just not using the logical operators correctly.
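Here is a minimal sketch that seems to reproduce the behavior, using a made-up vector v instead of the survey data (v and the names unguarded and guarded are only for illustration). In R, & binds more tightly than |, so the unparenthesized condition applies the is.na() check only to the first comparison, and an NA can slip through the second one. Whether this fully explains the fluctuating counts in eb is an assumption on my part.

```r
# Hypothetical stand-in for one survey item; not taken from eb
v <- c(1, 2, 3, NA)

# Written like the dummy2 condition. Because & has higher precedence
# than |, R reads this as (!is.na(v) & v == 1) | (v == 2)
unguarded <- !is.na(v) & v == 1 | v == 2
unguarded
# [1]  TRUE  TRUE FALSE    NA

# For the NA element: (FALSE & NA) | NA  ->  FALSE | NA  ->  NA,
# and ifelse() passes that NA straight through
ifelse(unguarded, "1", "0")
# [1] "1" "1" "0" NA

# With explicit parentheses the NA check covers both alternatives:
# FALSE & (NA | NA)  ->  FALSE
guarded <- !is.na(v) & (v == 1 | v == 2)
guarded
# [1]  TRUE  TRUE FALSE FALSE

# %in% returns FALSE for NA, so it needs no separate is.na() check
v %in% c(1, 2)
# [1]  TRUE  TRUE FALSE FALSE
```

If this is indeed the cause, the same grouping would presumably have to be applied to each of the three variables, for example eb$v246 %in% c(1, 2) & eb$v247 %in% c(1, 2) & eb$v250 %in% c(3, 4).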
Any help would be much appreciated!