I just stumbled upon some weird behavior in data.table. In short, using ":=" to change (replace) the value of a column in a data.table seems to also change the values in another data.table (which is a copy of the original data.table before the := operation). Sample code is below.

Am I missing something fundamental about the otherwise excellent package, or should there be a bug report?

Sub-question: Is ifelse() the best way to change the values as done below (in a fairly large table, ~10m rows)? It does the job as expected and is quick enough (a few seconds), but with verbose=TRUE data.table complains ("RHS for item 1 has been duplicated. Either NAMED vector or recycled list RHS.") and I have not been able to decipher the message so far :)

library(data.table)
options(datatable.verbose=TRUE)
DT1 <- data.table(f=as.integer(c(1,2,1,1,1,2,1)))
DT2 <- DT1

tables()

DT1
DT2
identical(DT1, DT2) # OK, they should be identical.

# I am not sure ifelse() is the best way to do this, but it does what I want, even though data.table complains
DT1[, f := as.character(ifelse(f==1,"a","b"))]

tables()
DT1
DT2
identical(DT1, DT2) # Not OK -- why did DT2 change?

If relevant, my system is:

R version 2.15.3 (2013-03-01) -- "Security Blanket"
Platform: x86_64-w64-mingw32/x64 (64-bit)
data.table 1.8.8
All 943 tests in test.data.table() completed ok in 27.869sec

Thanks.

Peter
  • use `DT2 <- copy(DT1)`. Otherwise they are pointing to the same memory location. There's no copy being made. – Arun Mar 12 '13 at 11:56
  • Not closing because it was a bad question. It's actually a great question with version numbers etc up front, and well written. Just happened to be a duplicate. The `ifelse()` is a different question. Each question on S.O. needs to be a single question please. – Matt Dowle Mar 12 '13 at 11:59
  • +1 and voting to close (for the reason @Matthew noted). Do feel free to re-ask the `ifelse()` portion as a separate question. – Josh O'Brien Mar 12 '13 at 12:06
  • Thank you all. I see the question is indeed a duplicate (I got all I need to know from the linked answer). Please feel free to delete the question as well (in addition to closing it), unless you think there might be some benefit in the different wording (which is why I failed to locate the answer). – Peter Mar 12 '13 at 12:17
  • Good idea. Won't delete so it serves as a signpost. I've seen S.O. veterans say that's good practice. – Matt Dowle Mar 12 '13 at 13:39
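
For anyone landing here via the signpost, the `copy()` suggestion from the comments can be sketched as follows. This is only a minimal illustration that reuses the question's `DT1` and `DT2`; `DT3` is a name introduced here for the explicit copy and is not part of the original code.

```r
library(data.table)

DT1 <- data.table(f = as.integer(c(1, 2, 1, 1, 1, 2, 1)))
DT2 <- DT1        # plain assignment: DT2 is a second name for the same table
DT3 <- copy(DT1)  # explicit copy: DT3 (hypothetical name) has its own data

# := updates DT1 by reference, so every name bound to that table sees the change
DT1[, f := ifelse(f == 1L, "a", "b")]

identical(DT1, DT2)  # DT2 follows DT1 because no copy was ever made
identical(DT1, DT3)  # DT3 still holds the original integer column
class(DT3$f)
```

The design point is that `<-` alone never duplicates a data.table, so a snapshot taken before a `:=` update has to be made explicitly with `copy()`.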