I have a data frame df with a variable x. However, two different expressions that check for NA give me different results. Can anyone explain why?

sum(is.na(df$x))
#[1] 41

df %>% filter(x==NA)
# A tibble: 0 x 1
Anders Ellern Bilgrau
Perhaps the answer to this question will clear some things up: https://stackoverflow.com/questions/25100974/na-matches-na-but-is-not-equal-to-na-why – sumshyftw Feb 25 '19 at 19:02

1 Answer

Note that a comparison with NA via == (nearly) always evaluates to NA. This is easily demonstrated with:

x <- c(1, 2, NA, 4)
x == NA
#[1] NA NA NA NA

See help("NA") and help("=="). From the latter documentation:

Missing values (NA) and NaN values are regarded as non-comparable even to themselves, so comparisons involving them will always result in NA.
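As a small base-R sketch of that rule (the vector x is the same illustrative one used above, not the asker's data), the following contrasts == with is.na(). The second half shows why a count built on == finds no matches while one built on is.na() does:

```r
x <- c(1, 2, NA, 4)

# == propagates missingness, even when NA is compared with itself.
NA == NA            # NA
NaN == NaN          # NA

# is.na() tests for missingness and always returns TRUE or FALSE.
is.na(x)            # FALSE FALSE  TRUE FALSE
sum(is.na(x))       # 1

# which() only reports positions that are TRUE, so the == test finds nothing.
which(x == NA)      # integer(0)
which(is.na(x))     # 3
```

Because no element of x == NA is ever TRUE, anything that keeps only the TRUE cases ends up empty.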

So your dplyr code should be:

df %>% filter(is.na(x))
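This is also why the original filter() call returned zero rows: filter() keeps only rows where the condition is TRUE, and rows where it evaluates to NA are dropped. A minimal sketch, assuming dplyr is installed and using a small made-up tibble in place of the asker's df:

```r
library(dplyr)

# Hypothetical stand-in for the question's df; two of the four values are missing.
df <- tibble(x = c(1, NA, 3, NA))

# x == NA is NA for every row, so filter() keeps nothing.
nrow(filter(df, x == NA))      # 0

# is.na(x) is TRUE exactly for the missing rows.
nrow(filter(df, is.na(x)))     # 2

# Negate it to keep only the non-missing rows.
nrow(filter(df, !is.na(x)))    # 2
```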
Anders Ellern Bilgrau