I have a square matrix with a few tens of thousands rows and columns with only a few 1
beside tons of 0
, so I use the Matrix
package to store that in R in an efficient way. The base::matrix
object cannot handle that amount of cells, as I run out of memory.
My problem is that I need the inverse and also the Moore-Penrose generalized inverse of such matrices, that I cannot compute at the time being.
What I have tried:
solve
yields anError in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
errorMASS::ginv
is not compatible with theMatrix
class- there is no direct way to convert a sparse
Matrix
to e.g.bigmemory::big.matrix
, and that latter neither does work withMASS::ginv
anyway if I try to compute the Choleski factorization of the matrix to later call
Matrix::chol2inv
on that, I get the following error message:Error in .local(x, ...) : internal_chm_factor: Cholesky factorization failed In addition: Warning message: In .local(x, ...) : Cholmod warning 'not positive definite' at file ../Cholesky/t_cholmod_rowfac.c, line 431
based on a related question, I also gave a try to the
pbdDMAT
package on a single node, butpbdDMAT::chol
yielded anCholmod error 'out of memory' at file ../Core/cholmod_memory.c, line 147
error message
Question in a nutshell: is there any way to compute the inverse and Moore-Penrose generalized inverse of such sparse matrix, beside falling back to using matrix
class on a computer with tons of RAM?
Quick reproducible example:
library(Matrix)
n <- 1e5
m <- sparseMatrix(1:n, 1:n, x = 1)
m <- do.call(rBind, lapply(1:10, function(i) {
set.seed(i)
m[-sample(1:n, n/3), ]
}))
m <- t(m) %*% m
Some descriptives (thanks to Gabor Grothendieck):
> dim(m)
[1] 100000 100000
> sum(m) / prod(dim(m))
[1] 6.6667e-05
> table(rowSums(m))
0 1 2 3 4 5 6 7 8 9 10
5 28 320 1622 5721 13563 22779 26011 19574 8676 1701
> table(colSums(m))
0 1 2 3 4 5 6 7 8 9 10
5 28 320 1622 5721 13563 22779 26011 19574 8676 1701
And some error messages:
> Matrix::solve(m)
Error in LU.dgC(a) : cs_lu(A) failed: near-singular A (or out of memory)
> base::solve(m)
Error in asMethod(object) :
Cholmod error 'problem too large' at file ../Core/cholmod_dense.c, line 105
> MASS::ginv(m)
Error in MASS::ginv(m) : 'X' must be a numeric or complex matrix
The goal of the bounty is to find a way to compute the Moore-Penrose generalized inverse of m
with less than 8Gb of RAM (speed and performance is not important).