I was working with a dataframe similar to this one, called DB_reduced:

sex BLT BLN BD
f NA 45 2
f 3 46 NA
m 3.5 NA 1
f 4 43 1
NA 3.4 46 3
f 3.4 46 3
NA 3.6 41 3

I expected to get the same result from these two lines of code:

DB_reduced[DB_reduced$sex == "f", 2] # first line
# or
subset(DB_reduced, DB_reduced$sex == "f", select = 2, drop = TRUE) # second line

but instead of both returning the same data, the first returns:

sex  BLT
f    NA
f    3
f    4
NA   3.4
f    3.4
NA   3.5

and the second:

sex  BLT
f    NA
f    3
f    4
f    3.4

Why the difference? I thought both lines worked the same way. How can I modify the first line to obtain the same result as the second?

Thanks all!

MJRC
    You can change the first line to `DB_reduced[DB_reduced$sex == "f" & !is.na(DB_reduced$sex), 2]` – MrFlick Jun 21 '22 at 20:59

1 Answer

The documentation for ?subset specifies the following:

subset  logical expression indicating elements or rows to keep: missing values are taken as false.

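So subset() treats an NA in the condition as FALSE and drops those rows, whereas [ returns an NA entry for every NA in a logical index. You can get the same result using [ by adding & !is.na(DB_reduced$sex) to the condition, as noted in the comments.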
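
As a rough illustration, here is a minimal sketch that rebuilds the example data from the question and compares the approaches. The helper names (is_f, with_na, by_isna, by_which, by_in, by_subset) are made up for this example, and the which() and %in% variants are alternatives not mentioned in the question or the comments:

```r
# Rebuild the example data frame from the question
DB_reduced <- data.frame(
  sex = c("f", "f", "m", "f", NA, "f", NA),
  BLT = c(NA, 3, 3.5, 4, 3.4, 3.4, 3.6),
  BLN = c(45, 46, NA, 43, 46, 46, 41),
  BD  = c(2, NA, 1, 1, 3, 3, 3)
)

# The comparison itself is NA wherever sex is missing
is_f <- DB_reduced$sex == "f"
is_f
# TRUE TRUE FALSE TRUE NA TRUE NA

# `[` keeps one (NA) entry per NA in the logical index
with_na <- DB_reduced[is_f, 2]

# Ways to treat NA as FALSE, so that `[` behaves like subset()
by_isna   <- DB_reduced[is_f & !is.na(is_f), 2]      # explicit NA check
by_which  <- DB_reduced[which(is_f), 2]              # which() ignores NA
by_in     <- DB_reduced[DB_reduced$sex %in% "f", 2]  # %in% never returns NA
by_subset <- subset(DB_reduced, sex == "f", select = 2, drop = TRUE)

identical(by_isna, by_subset)
identical(by_which, by_subset)
identical(by_in, by_subset)
```

Note that the remaining NA in the filtered result comes from the BLT column itself (the first row), not from a missing sex value.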

joran