
I recently asked a different question here on Stack Overflow. The suggested solutions were very concise, but the results showed a serious error that we couldn't resolve. Long story short: I have a data frame of the following form:

df <- structure(list(X = c("A", "A", "B", "C", "C"), Y = c(1L, 2L, 
 3L, 1L, 3L)), .Names = c("X", "Y"), class = "data.frame", row.names = c(NA, 
 -5L))

I want a list like this:

$`A`
[1] 1 2

$`B`
[1] 3

$`C`
[1] 1 3

@akrun suggested using data.table since my data has 22 million rows. Accordingly, I used the following code.

library(data.table)
 DT <- as.data.table(df)
 DT1 <- DT[, list(Y=list(Y)), by=X]
 DT1$Y
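As a cross-check on the grouped result, here is a minimal base R sketch that builds the same kind of list with `split()`. It assumes only the example data above; the names `res` and `res_factor` are introduced for illustration. It is not meant as a replacement for data.table on 22 million rows, only as a reference for what each group should contain.

```r
# Example data from the question (Y stored as integer).
df <- data.frame(
  X = c("A", "A", "B", "C", "C"),
  Y = c(1L, 2L, 3L, 1L, 3L),
  stringsAsFactors = FALSE
)

# split() groups the values of Y by the values of X and
# returns a named list with one element per group.
res <- split(df$Y, df$X)
print(res)

# The same call also accepts a factor; each element is then a factor
# that keeps the full set of levels.
res_factor <- split(factor(df$Y), df$X)
print(res_factor)
```

If the data.table result for a group differs from the corresponding element of `res`, the grouping step is the place to look.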

However, my Y is a factor. While the code works for an integer column, it doesn't work for a factor: every group returns the same values. I get the following result with the example data set, with the full 22 million rows, and with a subsample of 200 rows.

DT1$Y
#[[1]]
#[1] 1 3

#[[2]]
#[1] 1 3

#[[3]]
#[1] 1 3

Does anyone know why? I am using R 3.1.1 and data.table 1.9.2.

Daniel Schultz
  • I couldn't reproduce this behavior with data.table 1.9.4 and R 3.1.1 – talat Dec 08 '14 at 13:18
  • Can't reproduce with data.table 1.9.2. – nicola Dec 08 '14 at 13:26
  • [Here's the relevant updated post](http://stackoverflow.com/a/23288688/559784). It occurs in 1.9.2 and R v3.1+ only (due to changes in how R copies in 3.1+). Been fixed sometime ago. Please write back if upgrading doesn't fix the issue. – Arun Dec 08 '14 at 13:29
  • Ok I tried the example data on data.table 1.9.4 and it worked...Seems to be version 1.9.2. But I don't know why. Upgrading fixed the issue. – Daniel Schultz Dec 08 '14 at 13:39
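Following up on the comments, one way to check whether an installation is affected is to print the installed data.table version and compare the grouped result against a base R reference. This is only a sketch: the variable names `expected` and `got` are introduced here for illustration, and the data.table part is skipped if the package is not installed.

```r
# Example data with Y stored as a factor, as in the real data set.
df <- data.frame(
  X = c("A", "A", "B", "C", "C"),
  Y = factor(c(1L, 2L, 3L, 1L, 3L)),
  stringsAsFactors = FALSE
)

# Base R reference: one vector per group, converted to character
# so that the comparison ignores factor attributes.
expected <- lapply(split(df$Y, df$X), as.character)

if (requireNamespace("data.table", quietly = TRUE)) {
  library(data.table)

  # The comments report the problem for 1.9.2 and a fix after upgrading.
  print(packageVersion("data.table"))

  DT  <- as.data.table(df)
  DT1 <- DT[, list(Y = list(Y)), by = X]

  # Name the list elements by group so they line up with `expected`.
  got <- lapply(setNames(DT1$Y, DT1$X), as.character)

  # FALSE here would point to the repeated-group symptom described above.
  print(identical(got, expected))
}
```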
