
I have a list of large (35000 x 3) matrices in R that I want to combine into a single matrix. The result would be about 1 billion rows long, which would exceed the maximum object size in R.

The bigmemory package allows for larger matrices but doesn't appear to support rbind to put multiple matrices together.

Is there some other package or technique that supports the creation of a very large matrix from smaller matrices?

Also, before you ask: this is not a RAM issue, simply an R limitation, even on 64-bit R.

Ben Bolker
falcs

2 Answers


You could implement it with a loop:

library(bigmemory)

## Reproducible example
mat <- matrix(1, 50e3, 3)
l <- list(mat)
for (i in 2:100) {
  l[[i]] <- mat
}

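## Solution
m <- ncol(l[[1]])  ## assuming that all have the same number of columns
n <- sum(sapply(l, nrow))  ## total number of rows in the combined matrix

## Preallocate the full big.matrix once, then fill it block by block
bm <- big.matrix(n, m)
offset <- 0  ## number of rows already written
for (i in seq_along(l)) {
  mat_i <- l[[i]]
  n_i <- nrow(mat_i)
  ind_i <- seq_len(n_i) + offset  ## destination rows for this block
  bm[ind_i, ] <- mat_i
  offset <- offset + n_i
}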

## Verification: all rows were written and every value in column 1 is 1
stopifnot(offset == n, all(bm[, 1] == 1))
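If the combined matrix also needs to live on disk, the same fill loop should work with a file-backed matrix. The following is only a minimal sketch, assuming bigmemory's filebacked.big.matrix() and attach.big.matrix(); the file names "combined.bin" and "combined.desc", the use of tempdir(), and the small stand-in list are illustrative assumptions, not part of the original answer.

```r
library(bigmemory)

## Small stand-in for the real list of matrices (hypothetical sizes)
l <- rep(list(matrix(1, 1000, 3)), 5)

m <- ncol(l[[1]])                       ## assuming all have the same number of columns
n <- sum(vapply(l, nrow, integer(1)))   ## total number of rows

## Assumption: tempdir() stands in for a directory with enough disk space
backing_dir <- tempdir()
bm <- filebacked.big.matrix(
  nrow = n, ncol = m, type = "double",
  backingfile    = "combined.bin",      ## hypothetical file name
  descriptorfile = "combined.desc",     ## hypothetical file name
  backingpath    = backing_dir
)

## Same block-wise fill as in the in-memory version
offset <- 0
for (mat_i in l) {
  n_i <- nrow(mat_i)
  bm[offset + seq_len(n_i), ] <- mat_i
  offset <- offset + n_i
}

## The matrix can be re-attached later from its descriptor file
bm2 <- attach.big.matrix(file.path(backing_dir, "combined.desc"))
```

Because the data sit in the backing file, a later R session could attach the matrix from the descriptor file without rebuilding it.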
F. Privé
  • Seems like a sensible solution, did you ever get to try it out @falcs? I have a similar problem, but I also want the big.matrix to be filebacked. – Espen Riskedal Nov 27 '20 at 08:58

Not quite an answer, but a little more than a comment: are you sure that you can't do it by brute force? R now has long vectors (since version 3.0.0; the question you link to refers to R version 2.14.1): from this page,

Arrays (including matrices) can be based on long vectors provided each of their dimensions is at most 2^31 - 1: thus there are no 1-dimensional long arrays.

while the underlying atomic vector can go up to 2^52 - 1 elements ("in theory .. address space limits of current CPUs and OSes will be much smaller"). That means you should in principle be able to create a matrix with up to 2^31 - 1 rows, i.e. about 2.1 billion; and since the maximum "long" object size is about 4.5 x 10^15 elements (i.e. literally millions of billions), a matrix of 1 billion rows and 3 columns should (theoretically) not be a problem.
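The limits above can be checked with a few lines of base R. This is just a rough sketch of the arithmetic, assuming a double (8 bytes per element) matrix with exactly 1e9 rows and 3 columns; the variable names are illustrative.

```r
n_rows <- 1e9
n_cols <- 3

max_dim <- .Machine$integer.max   ## 2^31 - 1, the per-dimension limit
max_len <- 2^52                   ## approximate theoretical limit on vector length

n_elem   <- n_rows * n_cols       ## total number of elements
size_gib <- n_elem * 8 / 2^30     ## 8 bytes per double, expressed in GiB

fits_dim <- n_rows <= max_dim     ## is the row count within the dimension limit?
fits_len <- n_elem <= max_len     ## is the element count within the length limit?

cat("fits dimension limit:", fits_dim, "\n")
cat("fits length limit:   ", fits_len, "\n")
cat("approx. size in GiB: ", round(size_gib, 1), "\n")
```

Both checks pass, but the matrix would occupy roughly 22 GiB as doubles, so available memory rather than R's indexing becomes the practical constraint.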

Ben Bolker