
This is probably a basic question, but I don't understand what is happening. If I run:

sum(is.na(census$wd))

It returns 4205

But if I go with:

sum(census$wd == NA)

It returns "NA"

I would just like to understand what is happening. If I run str(census), wd shows up as:

$ wd         : num  NA 0.65 0.65 0.65 0.78 0.78 0.78 0.78 0.78 0.78 ...

Can anyone explain why the two lines of code return different outputs? Thank you!

A5C1D2H2I1M1N2O1R2T1
TWest

1 Answer


== in R is a comparison operator, but you cannot compare something to NA in R, as the following quote from ?Comparison states:

Missing values (NA) and NaN values are regarded as non-comparable even to themselves, so comparisons involving them will always result in NA.

In contrast, is.na indicates which elements are missing, regardless of their type, so it returns a vector of TRUE and FALSE entries.

> a <- c(NA,1,2,3)
> a == NA
[1] NA NA NA NA

> is.na(a)
[1]  TRUE FALSE FALSE FALSE

This is why sum works with is.na (interpreting TRUE as 1 and FALSE as 0), but it cannot sum a vector of NAs (generated by == NA): any sum that includes an NA is itself NA.
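
To tie this back to the question, here is a minimal sketch using a small made-up vector named wd in place of census$wd (the values are an assumption for illustration, not the real data). It shows that na.rm = TRUE does not rescue the == NA approach, since every comparison result is NA and gets dropped, while is.na gives both the count and the share of missing values.

```r
# Hypothetical toy vector standing in for census$wd
wd <- c(NA, 0.65, 0.65, 0.78)

# Each comparison with NA yields NA, so the sum is NA as well
na_sum <- sum(wd == NA)

# Dropping the NAs leaves nothing to add up, so this is 0, not the NA count
na_sum_rm <- sum(wd == NA, na.rm = TRUE)

# is.na() returns TRUE/FALSE, and sum() counts each TRUE as 1
n_missing <- sum(is.na(wd))

# mean() of the same logical vector gives the proportion of missing values
share_missing <- mean(is.na(wd))

print(na_sum)         # NA
print(na_sum_rm)      # 0
print(n_missing)      # 1
print(share_missing)  # 0.25
```

On the real data, sum(is.na(census$wd)) is the call that produces the count reported in the question.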

Rentrop