From what I can see here, I would assume that data.table v1.8.0+ does not automatically convert strings to factors.
Specifically, to quote Matthew Dowle from that page:
"No need for stringsAsFactors. Done like this in v1.8.0: character columns are now allowed in keys and are preferred to factor. data.table() and setkey() no longer coerce character to factor. Factors are still supported."
I'm not seeing that behavior. Here's my R session transcript:
First, I make sure I have a recent enough version of data.table (1.8.0 or later):
> library(data.table)
data.table 1.8.8 For help type: help("data.table")
Next, I create a 2x2 data.table from a character matrix. Notice that the columns come out as factors:
> m <- matrix(letters[1:4], ncol=2)
> str(data.table(m))
Classes ‘data.table’ and 'data.frame': 2 obs. of 2 variables:
$ V1: Factor w/ 2 levels "a","b": 1 2
$ V2: Factor w/ 2 levels "c","d": 1 2
- attr(*, ".internal.selfref")=<externalptr>
When I set stringsAsFactors=FALSE in data.frame() and then call data.table(), the columns stay character:
> str(data.table(data.frame(m, stringsAsFactors=FALSE)))
Classes ‘data.table’ and 'data.frame': 2 obs. of 2 variables:
$ X1: chr "a" "b"
$ X2: chr "c" "d"
- attr(*, ".internal.selfref")=<externalptr>
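The same check can be done on the column classes directly, without reading str() output. This is only a minimal sketch: the names dt_df, dt_vec, classes_df and classes_vec are made up for illustration, and the second route (passing the matrix columns to data.table() as plain character vectors) is an assumption based on the quoted note that data.table() no longer coerces character to factor, not something shown in the transcript above.

```r
library(data.table)

m <- matrix(letters[1:4], ncol = 2)

# Route 1 (from the transcript): go through data.frame() with
# stringsAsFactors = FALSE, then convert the result to a data.table.
dt_df <- data.table(data.frame(m, stringsAsFactors = FALSE))

# Route 2 (assumption): skip the matrix method entirely and hand
# data.table() the columns as ordinary character vectors.
dt_vec <- data.table(V1 = m[, 1], V2 = m[, 2])

# Collect the class of every column so the two routes can be compared.
classes_df  <- sapply(dt_df, class)
classes_vec <- sapply(dt_vec, class)

print(classes_df)
print(classes_vec)
```

If both vectors report character for every column, the conversion to factor is specific to passing a character matrix straight to data.table().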
What am I missing? Is data.table() supposed to convert strings to factors when given a character matrix, and if so, is there a "better way" of turning that behavior off?
Thanks!