One way to evaluate this is the inelegant
length(d$var[(d$var == 0) & (!is.na(d$var))])
(or, slightly more compactly, sum(d$var==0 & !is.na(d$var))).
I think your code reflects a few misunderstandings about R syntax. Let's make a compact, reproducible example to illustrate:
d <- data.frame(var=c(7, 0, NA, 0))
As you point out, length(d$var[d$var==0]) will return 3, because NA==0 is evaluated as NA, and indexing with NA returns an NA element instead of dropping it.
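To see where the 3 comes from, it may help to look at the intermediate logical vector. This is only a minimal sketch on the toy data above; the names cmp and picked are made up for illustration.

```r
d <- data.frame(var = c(7, 0, NA, 0))

# Element-wise comparison: the missing element compares to NA, not FALSE
cmp <- d$var == 0
print(cmp)

# Indexing with a logical NA keeps a slot and fills it with NA
picked <- d$var[cmp]
print(picked)
length(picked)

# Dropping the NA while summing the comparison counts only the real zeros
sum(cmp, na.rm = TRUE)
```

The last line is another way to get the count of zeros, since sum() can discard missing values with na.rm = TRUE.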
When you enclose the value you're looking for in quotation marks, R evaluates it as a string. So length(d$var[d$var == "NA"]) is asking how many elements in d$var are the character string "NA". Since no element of your data set is the string "NA", none of the comparisons is TRUE. The missing values, however, compare to NA (because "NA"==NA evaluates to NA), and each NA in the index returns an NA element, so what you get back is the number of missing values.
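The same inspection works for the string comparison. Again a sketch on the toy data, with cmp_str as an illustrative name:

```r
d <- data.frame(var = c(7, 0, NA, 0))

# The numeric column is coerced to character for the comparison;
# the missing value stays NA rather than becoming the string "NA"
cmp_str <- d$var == "NA"
print(cmp_str)

# No element is TRUE, but the single NA in the index yields one NA element
length(d$var[cmp_str])
```

With one missing value in the toy data the length is 1, which only looks like a count of strings equal to "NA".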
To answer your last question, look at what d$var[d$var==NA] returns: a vector of NAs of the same length as your original vector. Again, any == comparison with NA evaluates to NA. Since every comparison in that expression is against NA, the whole index is NA, and each element of the result is NA as well.
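If the goal is to find or count the missing values themselves, is.na() is the usual test instead of ==. A short sketch on the same toy data, with all_na and n_missing as illustrative names:

```r
d <- data.frame(var = c(7, 0, NA, 0))

# Every comparison with NA is NA, so the index is all NA
all_na <- d$var == NA
length(d$var[all_na])

# is.na() returns TRUE or FALSE, never NA, so it can be summed directly
n_missing <- sum(is.na(d$var))
n_missing
```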