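Your issue with subset() was only about the syntax for calling it with a logical vector that selects columns (its third arg, not its second). You can use either subset() or plain logical indexing. The latter is recommended: ?subset itself warns that subset() is a convenience function intended for interactive use.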
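For the plain-indexing route, here is a minimal sketch. The original DF is not shown, so the 4x4 matrix below is a hypothetical stand-in whose first two columns mirror the output further down; the other two columns are made up for illustration.

```r
# Hypothetical stand-in for DF (illustrative values only);
# matrix() fills column by column.
DF <- matrix(c(10, 21, 11, 50,
               50, 21, 30, 50,
                1,  2,  3,  4,
                5,  6,  7,  8), nrow = 4)

# Logical vector with one entry per column:
# TRUE where the last row equals 50
keep <- DF[nrow(DF), ] == 50

# Plain logical indexing on the column position.
# drop = FALSE keeps the result a matrix even if only one column matches.
result <- DF[, keep, drop = FALSE]
print(result)
```

Without drop = FALSE, a single matching column would be simplified to a plain vector, which is a common surprise when the result is used in later matrix code.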
The help page ?subset tells you that its optional second arg ('subset') is a logical expression that picks rows, and its optional third arg ('select') is an expression that picks columns (a logical vector with one entry per column works):
subset: logical expression indicating elements or rows to keep:
missing values are taken as false.
select: expression, indicating columns to select from a data frame.
So you want to call it with this logical column-vector:
> DF[nrow(DF),] == 50
[1] TRUE FALSE FALSE FALSE
There are two syntactic ways to leave subset()'s second arg unspecified and pass only the third arg:
# Explicitly pass the third arg by name...
> subset(DF, select=(DF[nrow(DF),] == 50) )
# Leave the 2nd arg empty; it is then treated as missing, so all rows are kept...
> subset(DF, , (DF[nrow(DF),] == 50) )
[,1] [,2]
[1,] 10 50
[2,] 21 21
[3,] 11 30
[4,] 50 50
The second way is probably preferable as it looks like generic row,col-indexing, and also doesn't require you to know the third arg's name.
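As a quick check that the two call forms are interchangeable, the sketch below runs both on a hypothetical matrix (the original DF is not shown, so these values are assumptions) and compares them with each other and with plain indexing.

```r
# Hypothetical stand-in for DF (illustrative values only)
DF <- matrix(c(10, 21, 11, 50,
               50, 21, 30, 50,
                1,  2,  3,  4,
                5,  6,  7,  8), nrow = 4)

# Form 1: pass the third arg by name
by_name <- subset(DF, select = (DF[nrow(DF), ] == 50))

# Form 2: leave the second arg empty, pass the third by position
by_position <- subset(DF, , (DF[nrow(DF), ] == 50))

# Plain logical indexing, for comparison
by_index <- DF[, DF[nrow(DF), ] == 50, drop = FALSE]

print(identical(by_name, by_position))
print(all(by_name == by_index))
```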
(As a mnemonic, in R and SQL terminology, 'select' implicitly means column indices, and 'filter'/'subset' implicitly means row indices. In data.table terminology these are the j-indices and i-indices respectively.)
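To make the rows-versus-columns mnemonic concrete, here is a small sketch on a data frame. The data frame df and its column names are hypothetical, chosen only for illustration.

```r
# Hypothetical data frame (names and values are illustrative only)
df <- data.frame(id    = 1:4,
                 score = c(10, 50, 30, 50),
                 grp   = c("a", "b", "a", "b"))

# 'subset' arg: filters rows
rows_only <- subset(df, score == 50)

# 'select' arg: picks columns
cols_only <- subset(df, select = c(id, score))

# Both at once: rows first, columns second, like df[rows, cols]
both <- subset(df, score == 50, select = c(id, score))

# The same result with plain indexing
same <- df[df$score == 50, c("id", "score")]

print(both)
```

The argument order of subset() matches the bracket order [rows, columns], which is why leaving the second arg empty reads like ordinary indexing.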