I have a large, sparse matrix saved in an RData file. The script that accesses this matrix is kicked off by a console call to Rscript, and it is both time- and resource-intensive to load the matrix on every call. Is there a way to hold the matrix in memory so that multiple console calls can use it without loading it as an object every single time?
http://stackoverflow.com/questions/41569997/is-it-possible-to-run-r-as-a-daemon – Dason Apr 04 '17 at 17:56
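The daemon idea in that comment can be sketched with the Rserve and RSclient packages: a long-running R process loads the matrix once, and each Rscript call connects to it instead of reloading. This is only a minimal sketch of that approach, not from the thread itself; the file name big_matrix.RData and the object name M are assumptions.
On the server session:
library(Rserve)
load("big_matrix.RData")  # hypothetical file, assumed to create object `M`
run.Rserve()              # serve clients from this process, so M stays loaded
From a client (e.g. inside an Rscript call):
library(RSclient)
con <- RS.connect()               # default localhost:6311
print(RS.eval(con, M[1:3, 1:3])) # evaluated in the server process, no reload
RS.close(con)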
2 Answers
Try the 'bigmemory' package. Basically, you create a matrix with a call to big.matrix() (or convert an existing one with as.big.matrix()), then obtain a hook to that matrix through a call to describe(). The content of the hook can then be used to attach the already-loaded matrix in another process via attach.big.matrix().
Edit: an example:
Start two R sessions, 1 and 2.
On session 1:
require(bigmemory)
system.time(M <- matrix(rnorm(1e8), 1e4))  # build a 1e4 x 1e4 matrix, ~9"
format(object.size(M), "Mb")               # ~762 Mb
system.time(M <- as.big.matrix(M))         # copy into a shared-memory big.matrix, ~3"
hook <- describe(M)                        # descriptor other processes can use
saveRDS(hook, "shared-matrix-hook.rds")    # persist the (tiny) descriptor
M[1:3, 1:3]
On session 2:
require(bigmemory)
system.time(hook <- readRDS("shared-matrix-hook.rds"))  # ~0.001"
system.time(Mshared <- attach.big.matrix(hook))         # attach to the shared segment, ~0.002"
Mshared[1:3, 1:3]    # shows the same values as session 1
Mshared[2, 2] <- 0   # check on session 1 that this change is visible there
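Back to the original question: each fresh Rscript invocation can then attach in milliseconds instead of reloading the RData file. A minimal sketch of such a worker script follows, with the script name and the example computation assumed; note that the shared segment only lives while at least one process (such as session 1) still has the matrix attached.
#!/usr/bin/env Rscript
# use_matrix.R (hypothetical name): attach the shared matrix, do not reload it
require(bigmemory)
hook <- readRDS("shared-matrix-hook.rds")  # descriptor written by session 1
M <- attach.big.matrix(hook)               # near-instant attach
cat(sum(M[1, ]), "\n")                     # example computation on row 1
Run it with Rscript use_matrix.R; only the tiny descriptor file is read from disk on each call.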

– IBrum
Interesting approach. But how does it hold the "state" of the matrix in a quick-to-load manner from one script request to another? – Unknown Coder Apr 04 '17 at 23:49
One way to hold the "state" is to save the hook (the result from describe()) to a file, which is read whenever another call comes in. – IBrum Apr 05 '17 at 00:39
Shouldn't the problem of sharing large data fundamentally call for an architecture designed around data sharing, such as an in-memory database?
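As a hedged sketch of that idea (not from the original answer): R can talk to a local Redis instance through the redux package, so one process stores the serialized matrix once and later Rscript calls fetch it by key instead of re-reading the RData file. The package choice, key name, and example matrix below are assumptions, and each fetch still pays a deserialization cost, so this helps most when the serialized object is compact (e.g. a sparse matrix).
library(redux)
r <- redux::hiredis()  # assumes a Redis server is running on localhost:6379

# One-time producer: serialize the matrix into Redis under a key.
M <- Matrix::rsparsematrix(1e4, 1e4, density = 0.001)  # example sparse matrix
r$SET("shared:matrix", serialize(M, NULL))

# Any later Rscript call: fetch by key, no RData load.
M2 <- unserialize(r$GET("shared:matrix"))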

– Soungno Kim