I noticed a behaviour in the data.table package that is probably linked to the way the package manages memory. I want to create a new data.table from an existing one and then remove columns from the new one. However, := NULL removes the columns from both data.tables, even though I only asked to modify one of them.

library(data.table)

dt1 <- data.table(A = rnorm(10,5,6), B = rnorm(10,2,1), C = rnorm(10,10,2))

dt2 <- dt1

names(dt1)
names(dt2)

dt2[, c("B", "C") := NULL]

names(dt2) # Expected
# [1] "A"
names(dt1) # Weird!
# [1] "A"

My current solution to avoid this is to convert the data.table to a data.frame, remove the columns, and convert it back to a data.table:

dt1 <- data.table(A = rnorm(10,5,6), B = rnorm(10,2,1), C = rnorm(10,10,2))

dt2 <- as.data.frame(dt1)
dt2 <- as.data.table(dt2[!names(dt2) %in% c("B", "C")])

names(dt1) # Expected
# [1] "A" "B" "C"
names(dt2) # Expected
# [1] "A"

There must be a more intuitive way of doing this that is also efficient in memory and code. Any suggestions?

Mikko

1 Answer

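You should use copy(). A plain dt2 <- dt1 does not copy the data: both names refer to the same data.table, and := modifies that object by reference.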

dt2 <- copy(dt1) 
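
As a rough sketch of why this works, the example below first checks that a plain assignment shares the underlying object, then removes columns from a copy. It uses small integer columns instead of the rnorm() data from the question, and the names alias, same_object, keep and dt3 are made up for illustration. The last part shows one possible alternative, assuming you would rather select the columns to keep than delete the unwanted ones.

```r
library(data.table)

dt1 <- data.table(A = 1:3, B = 4:6, C = 7:9)

# Plain assignment only binds a second name to the same object.
alias <- dt1
same_object <- identical(address(alias), address(dt1))
same_object
# [1] TRUE

# copy() makes a deep copy, so := on dt2 leaves dt1 untouched.
dt2 <- copy(dt1)
dt2[, c("B", "C") := NULL]

# Alternative: selecting the columns to keep returns a new data.table.
keep <- setdiff(names(dt1), c("B", "C"))
dt3 <- dt1[, keep, with = FALSE]

names(dt1)
# [1] "A" "B" "C"
names(dt2)
# [1] "A"
names(dt3)
# [1] "A"
```

copy() duplicates every column before the unwanted ones are dropped, so for a wide table the column selection may be the lighter option, since only the kept columns are copied.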
PavoDive
    Ok. Thanks! I knew that there would be an easy answer to this. Just couldn't find it (at the sea, bad internet). – Mikko Sep 01 '19 at 14:18