I am wondering if it is possible to subset a data.table by reference.

A solution that involves updating by assignment is:

library(data.table)

iris <- as.data.table(iris)
iris <- iris[Species == "virginica"]

The downside of this approach is that the filtered rows are copied into a new data.table. If possible, I would like to filter by reference, possibly using the := operator and the .SD shortcut.

Josh Persi
  • It's not clear what you're trying to do. If all you want is to replace `iris` with a subset of itself, it will necessarily create a new object in memory. – jblood94 Jun 14 '23 at 18:38
  • Is that because subsetting behaves differently than adding or removing columns? I'm trying to understand if there is a way to subset by reference just like there is a way to add or remove columns by reference. – Josh Persi Jun 14 '23 at 18:48

1 Answer

See the related question "How to delete a row by reference in data.table?"

You can modify a subset of a data.table by reference, though:

iris <- as.data.table(iris)
address(iris)
#> [1] "000001dc87099330"

A row subset of iris, however, is necessarily a new object in memory.

iris2 <- iris[Species == "virginica"]
address(iris2)
#> [1] "000001dc876cdc30"
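As a minimal sketch of what that means in practice (assuming the data.table package and the built-in iris data set; the names dt and sub are only illustrative and do not come from the question): because the row subset is a separate object, updating it by reference leaves the original table untouched.

```r
library(data.table)

dt  <- as.data.table(iris)          # illustrative name
sub <- dt[Species == "virginica"]   # row subset: a new data.table

# The subset lives at a different address than the original
different_address <- address(dt) != address(sub)

# Updating the subset by reference does not modify the original
sub[, Sepal.Length := 0]
original_untouched <- dt[Species == "virginica", all(Sepal.Length > 0)]

n_original <- nrow(dt)    # the original keeps all of its rows
n_subset   <- nrow(sub)   # the subset holds only the virginica rows
```

So filtering always yields a second table; the original stays in memory until nothing refers to it any more.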

Modifying a subset of a column or adding/deleting a column can be done by reference.

iris[Species == "virginica", Species := "virg"]
address(iris)
#> [1] "000001dc87099330"
iris[, Species := NULL]
address(iris)
#> [1] "000001dc87099330"
iris[, Species := "virginica"]
address(iris)
#> [1] "000001dc87099330"
jblood94