I have a ~20,000 x 20,000 data.table. How do I convert it from a data.table to a matrix
efficiently, in terms of both speed and memory?
I tried m = as.matrix(dt), but it takes very long and produces many warnings. df = data.frame(dt)
also takes very long and runs into the memory limit.
Is there an efficient way to do this? Or is there simply a function in data.table that returns dt
in matrix form (as required to feed a statistical model in the glmnet package)?
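For context, here is a minimal sketch of the workflow I am after, using a small dummy table in place of my real data (the sizes and the dummy response here are just placeholders):

library(data.table)
library(glmnet)

set.seed(1)
# dummy stand-in for my real 20,000 x 20,000 table: 100 rows, 20 numeric columns
dt <- as.data.table(matrix(rnorm(100 * 20), nrow = 100))
y  <- rnorm(100)            # dummy response vector

x   <- as.matrix(dt)        # this is the step that blows up at full scale
fit <- glmnet(x, y)         # glmnet() wants x as a numeric matrix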
Simply wrapping dt in as.matrix gives me the error below:
x = as.matrix(dt)
Error: cannot allocate vector of size 2.9 Gb
In addition: Warning messages:
1: In unlist(X, recursive = FALSE, use.names = FALSE) : Reached total allocation of 8131Mb: see help(memory.size)
2: In unlist(X, recursive = FALSE, use.names = FALSE) : Reached total allocation of 8131Mb: see help(memory.size)
3: In unlist(X, recursive = FALSE, use.names = FALSE) : Reached total allocation of 8131Mb: see help(memory.size)
4: In unlist(X, recursive = FALSE, use.names = FALSE) : Reached total allocation of 8131Mb: see help(memory.size)
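My rough back-of-the-envelope check (assuming the result is a dense matrix of doubles, 8 bytes per cell) is that the converted matrix alone needs about 3 GB, which seems to match the size in the error:

# expected size of a 20,000 x 20,000 double matrix, in GiB
20000 * 20000 * 8 / 1024^3   # ~2.98, consistent with the "2.9 Gb" allocation failure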
My setup: 64-bit Windows 7 with 8 GB of RAM. Windows Task Manager has shown Rgui.exe using more than 4 GB before and it was still fine.