When I run the following code using the 'data.table' package, it appears that dt and dt1 are linked. Both tables now have the new Calc variable, even though I only defined it in dt1. What is going on here?

library(data.table)
dt <- data.table(Var1 = c(1,2,3), Var2 = c(4,5,6))
dt1 <- dt
dt1[, Calc := Var1 + Var2]
  • Use `dt1 <- copy(dt)` – akrun Aug 17 '22 at 17:48
  • Thanks, so the other command links the tables? – billybob196 Aug 17 '22 at 17:56
  • With `data.table`, the `set*` functions as well as `:=` operate by reference. So if the newly created object is not copied, the original gets modified too. – akrun Aug 17 '22 at 18:01
  • It might be useful to know that this is really a normal behavior for *most* programming languages. Usually, this is referred to as a shallow copy. A deep copy ensures the two objects aren't linked. In R, though, it's normally always a deep copy. – Kat Aug 17 '22 at 18:14
  • @Kat in R it is normally a shallow copy; a deep copy is made later on, only if the shallow-copied object is altered. This is called copy-on-write. – jangorecki Aug 17 '22 at 19:24
  • @billybob196, this is by design in `data.table`'s referential semantics. If you want `dt` and `dt1` to be completely separate objects, then as akrun suggested, use `dt1 <- copy(dt)`. – r2evans Aug 17 '22 at 20:15
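
To make the comments above concrete, here is a minimal sketch contrasting plain assignment with `copy()`, and contrasting both with base R's copy-on-write for a `data.frame`. The object names `alias`, `independent`, `df` and `df2` are made up for illustration and are not part of the question.

```r
library(data.table)

dt <- data.table(Var1 = c(1, 2, 3), Var2 = c(4, 5, 6))

# Plain assignment: `alias` is a second name for the same table.
alias <- dt
address(dt) == address(alias)        # TRUE, same object in memory

# copy() creates an independent table.
independent <- copy(dt)
address(dt) == address(independent)  # FALSE

# := on the copy leaves `dt` untouched.
independent[, Calc := Var1 + Var2]
"Calc" %in% names(dt)                # FALSE

# := on the alias modifies `dt` as well, because := works by reference.
alias[, Calc := Var1 + Var2]
"Calc" %in% names(dt)                # TRUE

# Base R for comparison: modifying `df2` triggers copy-on-write,
# so `df` does not gain the new column.
df  <- data.frame(Var1 = c(1, 2, 3))
df2 <- df
df2$Calc <- df2$Var1 * 2
"Calc" %in% names(df)                # FALSE
```

In short, `<-` alone never copies a `data.table`; whether the original changes depends on whether the later modification is done by reference (`:=`, `set*`) or through ordinary base R assignment.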
