What is the exact difference between addressing a column by $mycol, [[mycol]] and [, mycol, with=FALSE]?

Question

After reading the data.table FAQ (section 1.5), I had the impression that all three ways of addressing a column are more or less equivalent. But at least the output of [, mycol, with=FALSE] is quite different from that of $mycol and [[mycol]]:

dt1 <- fread(
  " id,colA,colB
   id1,3,xxx
   id2,0,zzz
   id3,NA,yyy
   id4,0,aaa
     ")

dt1$colA <- factor(dt1$colA)

myvar="colA"

dt1$colA
# [1] 3    0    <NA> 0   
# Levels: 0 3

dt1[[myvar]]
# [1] 3    0    <NA> 0   
# Levels: 0 3

dt1[, myvar, with=FALSE]
# colA
# 1:    3
# 2:    0
# 3:   NA
# 4:    0

So, what is the exact difference between those three approaches? Can I assume that $mycol and [[mycol]] are always identical? Why does [, mycol, with=FALSE] "lose" the factor?

Thanks in advance.

score 3 · Accepted Answer · edited Apr 16 '20 at 14:24

The first part of your question, the difference between $ and [[, has been covered before in this question:

Indexing by [ is similar to atomic vectors and selects a list of the specified element(s).

Both [[ and $ select a single element of the list. The main difference is that $ does not allow computed indices, whereas [[ does. x$name is equivalent to x[["name", exact = FALSE]]. Also, the partial matching behavior of [[ can be controlled using the exact argument.
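
As a quick illustration of the quoted rules, here is a minimal sketch on a plain list (the element names alpha and beta are made up for the example, not taken from the question):

```r
# A plain list with two made-up elements
x <- list(alpha = 1:3, beta = c("a", "b", "c"))
myvar <- "alpha"

# $ takes a literal name and matches it partially
by_dollar  <- x$al                      # partial match on "alpha"

# [[ matches exactly by default ...
by_exact   <- x[["al"]]                 # no element is named exactly "al"

# ... unless partial matching is requested explicitly
by_partial <- x[["al", exact = FALSE]]

# $ cannot use a computed index: it looks for an element named "myvar"
literal    <- x$myvar

# [[ evaluates myvar and uses its value, "alpha"
computed   <- x[[myvar]]

identical(by_dollar, by_partial)        # TRUE
is.null(by_exact)                       # TRUE
is.null(literal)                        # TRUE
identical(computed, x$alpha)            # TRUE
```

So $mycol and [[mycol]] agree when mycol is the full, literal column name; they can differ once partial names or variables holding names are involved.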

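The notation dt1[, ..myvar] in data.table (equivalent to dt1[, myvar, with=FALSE]) produces a data table containing the columns named in myvar. The result is a one-column data table, and the class of that column is still factor. The factor is not lost: printing a data table shows the values without the Levels line that a bare factor vector prints.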

The data frame equivalent would be: as.data.frame(dt1)[, myvar, drop = FALSE].
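
To check this rather than rely on the printed output, something like the following sketch can be used. It rebuilds the question's table with data.table() instead of fread(), and assumes a data.table version recent enough to support the .. prefix:

```r
library(data.table)

# Same data as in the question, built directly
dt1 <- data.table(
  id   = paste0("id", 1:4),
  colA = factor(c(3, 0, NA, 0)),
  colB = c("xxx", "zzz", "yyy", "aaa")
)
myvar <- "colA"

sub_dt   <- dt1[, ..myvar]                            # one-column data.table
sub_with <- dt1[, myvar, with = FALSE]                # same selection, older spelling
sub_df   <- as.data.frame(dt1)[, myvar, drop = FALSE] # one-column data.frame

# The container differs from $ and [[, but the column inside is still a factor
class(sub_dt)
class(sub_dt[[myvar]])
levels(sub_with[[myvar]])
class(sub_df[[myvar]])
```

In other words, $ and [[ hand back the column vector itself, while the with=FALSE form hands back a table that wraps the same column.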

ah, I missed that the last form also gives `factor`. Thanks a lot for explaining! — Vasily A, Jun 10 '14 at 15:33

