Is it possible to do? Profiling in RStudio shows that my simple row selections using the table's key allocate an excessive amount of memory when aggregated over thousands of slices at different keys.

library(data.table)
foo = data.table(name=rep_len(letters, 10000), v1=rnorm(10000), v2=rnorm(10000), v3=rnorm(10000))
setkey(foo, name)
# (i %% 26) + 1 keeps the index in 1..26; letters[0] would select nothing
for(i in 1:10000) slice = foo[letters[(i %% 26) + 1]]

Profile the for loop and you will see it allocate 500 MB.
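
For anyone who wants to reproduce the measurement outside RStudio, here is a minimal sketch using base R's sampling profiler with memory statistics turned on. The temporary file name `prof_file` and the seed are my own choices for illustration, and the figures reported will vary by machine and data.table version.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only to make the data repeatable
foo <- data.table(name = rep_len(letters, 10000),
                  v1 = rnorm(10000), v2 = rnorm(10000), v3 = rnorm(10000))
setkey(foo, name)

# Scratch file for the profiler output (name chosen for this sketch)
prof_file <- tempfile(fileext = ".out")

# Sample the call stack and record memory statistics alongside it
Rprof(prof_file, memory.profiling = TRUE)
for (i in 1:10000) slice <- foo[letters[(i %% 26) + 1]]
Rprof(NULL)  # stop profiling

# Per-function totals, including the memory columns
prof <- summaryRprof(prof_file, memory = "both")
head(prof$by.total)
```

The memory columns come from sampling, so they are an estimate of allocation activity rather than an exact byte count.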

Matt Chambers
  • Please read the info about [how to ask a good question](http://stackoverflow.com/help/how-to-ask) and how to give a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610). This will make it much easier for others to help you. – Jaap Jul 29 '16 at 19:39
  • It is my understanding that this is not possible as of right now. You can view the issue here: https://github.com/Rdatatable/data.table/issues/635 – Boudewijn Aasman Jul 29 '16 at 19:57
  • Hard to know what you mean, but no, data.table does not have views like https://en.wikipedia.org/wiki/View_(SQL) You can get that in Stata, though, fwiw. – Frank Jul 29 '16 at 20:00
  • Depends on what you want to do with the slice. If you don't need to persist it then you can call gc() to release the memory (might not always help). Alternatively instead of creating actual slices just get the rows which match your criteria. – Rohit Das Jul 29 '16 at 20:15
  • @BoudewijnAasman: That pretty much answers my question. Thanks! – Matt Chambers Jul 29 '16 at 20:16
  • @RohitDas What's the difference between getting the rows and creating the slice? Whether it's assigned or not? I tried `for(i in 1:10000) if(sum(foo[letters[(i%%26)+1]]$v1) > 51) print(i)` and I get the same memory profile. – Matt Chambers Jul 29 '16 at 20:18
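
The "get the rows instead of a slice" idea from the comments could be sketched roughly as follows. This is only an illustration under my own assumptions: it uses the `which = TRUE` argument of `[.data.table` to get matching row numbers and then indexes a single column, plus a grouped aggregation as an alternative to looping. The names `idx`, `sum_a` and `sums` are made up for this sketch, and whether either variant actually lowers the memory profile would need to be measured.

```r
library(data.table)

set.seed(1)  # arbitrary seed, only to make the data repeatable
foo <- data.table(name = rep_len(letters, 10000),
                  v1 = rnorm(10000), v2 = rnorm(10000), v3 = rnorm(10000))
setkey(foo, name)

# which = TRUE returns the matching row numbers instead of
# materialising a new data.table with every column copied
idx <- foo["a", which = TRUE]

# Index only the column that is needed; this still allocates
# a vector of length(idx), but not the other columns
sum_a <- sum(foo$v1[idx])

# If the same statistic is needed for every key, one grouped pass
# avoids the per-key loop altogether
sums <- foo[, .(v1_sum = sum(v1)), by = name]
```

Neither variant is a true view: both still allocate, just less per key than copying all four columns into a new table.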
