R (version 3.3.3) is giving me some unexpected behavior when subsetting a data frame on a condition based on a character column. Here is an example:
foo <- data.frame(bar = c('a', NA, 'b', 'a'),
                  baz = 1:4,
                  stringsAsFactors = FALSE)
foo looks like this:
   bar baz
1    a   1
2 <NA>   2
3    b   3
4    a   4
I want to get all rows of this data frame where bar != "a", so I call:
foo[foo$bar != 'a', ]
This returns:
    bar baz
NA <NA>  NA
3     b   3
I do not understand why the first entry in the second column (baz) is NA and not 2. Please help me understand this behavior.
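To narrow this down, here is a minimal sketch of what the comparison itself seems to evaluate to, along with two possible workarounds. The names cond, res_keep_na, and res_drop_na are introduced here only for illustration, and I am assuming that the NA in the logical index is what produces the all-NA row:

```r
foo <- data.frame(bar = c('a', NA, 'b', 'a'),
                  baz = 1:4,
                  stringsAsFactors = FALSE)

# Comparing NA with 'a' gives NA rather than TRUE or FALSE
cond <- foo$bar != 'a'
print(cond)

# Indexing rows with a logical vector that contains NA
# appears to return an all-NA row for that position
print(foo[cond, ])

# Possible workaround 1: keep the row where bar is missing
res_keep_na <- foo[is.na(foo$bar) | foo$bar != 'a', ]

# Possible workaround 2: which() returns only the TRUE positions,
# so the row where bar is missing is dropped
res_drop_na <- foo[which(foo$bar != 'a'), ]
```

Which of the two workarounds fits depends on whether a missing bar should count as "not a".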