I am fairly new to R and am having trouble generating a dummy variable that depends on several conditions.
I am trying to create the dummy variable 'GRDUMMY'. GRDUMMY should take the value 1 if:
- SG_MA > SG_MA_Year_Avg & LIQ < LIQ_Year_avg
Otherwise, it should take value 0.
One complication is that both SG_MA and LIQ contain missing values (although SG_MA_Year_Avg and LIQ_Year_avg do not).
To generate the dummy variable and handle these issues, I have tried the following code:
for (i in 1:nrow(Merge_GRDUMMY)) {
  if (is.na(Merge_GRDUMMY$SG_MA[i])) {
    Merge_GRDUMMY$GRDUMMY <- "NA"
  } else if (is.na(Merge_GRDUMMY$LIQ[i])) {
    Merge_GRDUMMY$GRDUMMY <- "NA"
  } else if (Merge_GRDUMMY$SG_MA[i] > Merge_GRDUMMY$SG_MA_Year_Avg[i] & Merge_GRDUMMY$LIQ[i] < Merge_GRDUMMY$LIQ_Year_avg[i]) {
    Merge_GRDUMMY$GRDUMMY <- 1
  } else {
    Merge_GRDUMMY$GRDUMMY <- 0
  }
}
Sample data:
> dput(Merge_GRDUMMY[1:4, c(14, 16, 21, 22)])
structure(list(SG_MA = c(NA_real_, NA_real_, NA_real_, NA_real_
), LIQ = c(-0.166091210233936, -0.238975053258208, -0.0423391360788804,
-0.0255328112422608), SG_MA_Year_Avg = c(NaN, NaN, NaN, NaN),
LIQ_Year_avg = c(-0.0460118085010656, -0.0460118085010656,
-0.0460118085010656, -0.0460118085010656)), row.names = c(NA,
4L), class = "data.frame")
My problem is that the loop above seems to execute every branch and ends up assigning the value "0" to all observations, even those with missing values. Any tips on what I am doing wrong?
Many thanks!
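
One likely cause, judging only from the code as posted: every branch assigns to the whole column Merge_GRDUMMY$GRDUMMY rather than to row i, so each iteration overwrites all rows and only the result for the last row survives. Indexing the assignment with [i], and using NA instead of the string "NA", would probably fix the loop. The loop can also be avoided with a vectorized comparison. The sketch below shows that approach on a small made-up data frame (df and its values are hypothetical; only the column names come from the question).

```r
# Toy data with hypothetical values; column names follow the question
df <- data.frame(
  SG_MA          = c(0.30, NA, 0.10, 0.40),
  LIQ            = c(-0.20, -0.10, -0.20, NA),
  SG_MA_Year_Avg = c(0.20, 0.20, 0.20, 0.20),
  LIQ_Year_avg   = c(-0.05, -0.05, -0.05, -0.05)
)

# Vectorized comparison over all rows at once (no loop needed)
cond <- df$SG_MA > df$SG_MA_Year_Avg & df$LIQ < df$LIQ_Year_avg

# as.integer() maps TRUE to 1, FALSE to 0, and keeps NA as NA
df$GRDUMMY <- as.integer(cond)

# NA & FALSE evaluates to FALSE in R, so rows with a missing input could
# still come out as 0; explicitly set them to NA to match the loop's intent
df$GRDUMMY[is.na(df$SG_MA) | is.na(df$LIQ)] <- NA_integer_

print(df$GRDUMMY)
```

Note that is.na() also returns TRUE for NaN, and comparisons against NaN yield NA, so the NaN values in SG_MA_Year_Avg in the sample data would propagate as NA unless they are handled separately.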