When I copy a data.table and modify the new one, the original one gets altered as well. Is this normal behaviour?

library(data.table)
dt = data.table(zone=1:5, pc=11:15)
dtt = dt
dtt[, pc := pc*2 ]
dtt

       zone pc
    1:    1 22
    2:    2 24
    3:    3 26
    4:    4 28
    5:    5 30

dt

       zone pc
    1:    1 22
    2:    2 24
    3:    3 26
    4:    4 28
    5:    5 30

I have no problem when creating the new data.table more explicitly: dtt = data.table(dt)

Antoine Gautier
    I'd like to vote to close this question. You can check the details in this post. http://stackoverflow.com/questions/10225098/understanding-exactly-when-a-data-table-is-a-reference-to-vs-a-copy-of-another – Bigchao Apr 09 '14 at 08:07

1 Answer

When you assign an existing object to a new variable, R doesn't create a copy; the new variable just points to the same object. This is very nice, as you don't want to make copies unless you absolutely need to. Base R only makes the copy later, when one of the variables is modified (copy-on-modify).
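
As a minimal sketch of copy-on-modify in base R (plain numeric vectors, with x and y as made-up names for illustration):

```r
x <- c(1, 2, 3)
y <- x          # no copy yet: y refers to the same vector as x
y[1] <- 100     # modifying y makes R copy the vector first
x               # still 1 2 3
y               # now 100 2 3
```

Here the ordinary assignment into y leaves x alone, which is the behaviour most people expect from R.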

After this, you use the := operator, which modifies in place (by reference) and therefore does not trigger a copy. Since both variables are pointing to the same location at that moment, the change is reflected in both of them.
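
To check this yourself, something like the following should work. It is only a sketch: it rebuilds the table from the question and uses address() from data.table to compare memory locations; same_before and same_after are names I made up for the checks.

```r
library(data.table)

dt  <- data.table(zone = 1:5, pc = 11:15)
dtt <- dt                      # plain assignment: a second name for the same object

# Both names should report the same memory address
same_before <- address(dt) == address(dtt)

dtt[, pc := pc * 2]            # := updates the shared object in place

# dt shows the doubled values as well
same_after <- identical(dt$pc, dtt$pc)
print(c(same_before, same_after))
```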

The fix is to explicitly copy the data.table using the copy() function and then assign by reference as follows:

dtt = copy(dt)     ## dt and dtt are not pointing to same locations anymore
dtt[, pc := pc*2]  ## assignment by reference doesn't affect dt
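
A quick way to confirm that the copy really is independent (again just a sketch, reusing the table from the question; the name shared is mine):

```r
library(data.table)

dt  <- data.table(zone = 1:5, pc = 11:15)
dtt <- copy(dt)                # deep copy: dtt gets its own memory

# The two tables should no longer report the same address
shared <- address(dt) == address(dtt)

dtt[, pc := pc * 2]            # only dtt is modified
dt$pc                          # 11 12 13 14 15
dtt$pc                         # 22 24 26 28 30
```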

HTH

Arun
Sandman
  • My problem was with an R data.table (I forgot to mention it at first, sorry). I guess it implements the same data structure as the one you point out. You are right about explicitly copying the table, cf. my edit. – Antoine Gautier Apr 09 '14 at 08:02
  • @Arun Anything to help is good :) – Sandman Apr 10 '14 at 11:26