
I have a large, sparse matrix saved in an .RData file. The script that accesses this matrix will be kicked off by a console call to Rscript. It is both time- and resource-intensive to load the matrix on every call. Is there a way to hold the matrix in memory so that multiple calls from the console can use it without loading it as an object every single time?

– Unknown Coder

2 Answers


Try the 'bigmemory' package. Basically, you create a matrix with a call to 'big.matrix()' (or convert an existing one with 'as.big.matrix()'), then obtain a descriptor for it, the "hook", through a call to 'describe()'. That hook can then be used to attach the already-loaded matrix in another process via 'attach.big.matrix()'.

Edit: an example:

Start 2 R sessions, 1 & 2

on Session 1:

require(bigmemory)
system.time(M <- matrix(rnorm(1e8), 1e4)) # ~9"
format(object.size(M), "Mb") # ~762Mb
system.time(M <- as.big.matrix(M)) # ~ 3"

hook <- describe(M) # the "hook": a small descriptor other processes can attach
saveRDS(hook, "shared-matrix-hook.rds")
M[1:3,1:3]

on Session 2:

require(bigmemory)
system.time(hook <- readRDS("shared-matrix-hook.rds")) # 0.001"

system.time(Mshared <- attach.big.matrix(hook)) # 0.002"

Mshared[1:3,1:3] # shows the same as session 1 did
Mshared[2,2] = 0 # check on session 1 that this change is present there
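
Note that the shared segment above lives only as long as at least one session keeps the matrix attached; once every attached session exits, the hook is useless. If the matrix should also survive independent sessions, a file-backed big.matrix is one option. A minimal sketch, with hypothetical file names 'M.bin' and 'M.desc' (not part of the original example):

require(bigmemory)
m <- matrix(rnorm(1e4), 1e2)                    # an ordinary in-RAM matrix
Mfb <- as.big.matrix(m, backingfile = "M.bin",
                     descriptorfile = "M.desc") # data are memory-mapped to disk

# any later session or Rscript call can attach via the descriptor file,
# even after the creating session has exited:
Mshared <- attach.big.matrix("M.desc")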
– IBrum
  • Interesting approach. But how does it hold the "state" of the matrix in a quick-to-load manner from one script request to another? – Unknown Coder Apr 04 '17 at 23:49
  • One way to hold the "state" is to save the hook (the result of 'describe()') to a file that is read on each subsequent call. – IBrum Apr 05 '17 at 00:39
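
Tying this back to the question: each console invocation can then be a short script that attaches the shared matrix instead of loading the .RData file. A minimal sketch, assuming a long-running session has created the matrix and saved its descriptor as above ('use-matrix.R' is a hypothetical name):

#!/usr/bin/env Rscript
# use-matrix.R -- attach the shared matrix instead of load()-ing it
require(bigmemory)
hook <- readRDS("shared-matrix-hook.rds")  # tiny file: only the descriptor
M <- attach.big.matrix(hook)               # maps the shared memory, no copy
cat(M[1, 1], "\n")                         # ...do the real work here

Invoked from the console as 'Rscript use-matrix.R'. The creating session must still be holding the data (or use the file-backed variant sketched above).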

Shouldn't the problem of sharing large data fundamentally call for an architecture built around data sharing, such as an in-memory database?
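
For illustration, a minimal sketch of that idea, assuming a Redis server running locally and the 'redux' package (both assumptions, not part of this answer). The matrix then lives in the database's RAM rather than in any one R process:

require(redux)
r <- hiredis()                               # connect to Redis on localhost

# one-off: serialize the matrix into the in-memory DB
M <- matrix(rnorm(1e4), 1e2)
r$SET("big_matrix", object_to_bin(M))

# per console call: fetch from RAM instead of re-reading the .RData file
M2 <- bin_to_object(r$GET("big_matrix"))

The trade-off is that each call still pays for deserialization, so this helps mainly when reading the .RData file from disk is the bottleneck.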