5

I have an array of matrices.

dims <- c(10000,5,5)
mat_array <- array(rnorm(prod(dims)), dims)

I would like to perform a matrix-based operation (e.g. inversion via the solve function) on each matrix, but preserve the full structure of the array.

So far, I have come up with 3 options:

Option 1: A loop, which does exactly what I want, but is clunky and inefficient.

mat_inv <- array(NA, dims)
for(i in 1:dims[1]) mat_inv[i,,] <- solve(mat_array[i,,])

Option 2: The apply function, which is faster and cleaner, BUT squishes each matrix down to a vector.

mat_inv <- apply(mat_array, 1, solve)
dim(mat_inv)
[1]    25 10000

I know I can set the output dimensions to match those of the input, but I'm wary of doing this and messing up the indexing, especially if I had to apply over non-adjacent dimensions (e.g. if I wanted to invert across dimension 2).

Option 3: The aaply function from the plyr package, which does exactly what I want, but is MUCH slower (4-5x) than the others.

mat_inv <- plyr::aaply(mat_array, 1, solve)

Are there any options that combine the speed of base::apply with the versatility of plyr::aaply?

Alexey Shiklomanov
  • 1,592
  • 13
  • 23
  • 2
    I don't see anything wrong with using a `for` loop like that since you are preallocating the result object. Besides, it's significantly faster than `aaply`, so why not use it? – nrussell Jun 10 '16 at 18:04
  • 2
    It rapidly becomes clunky once the dimensionality of the problem increases. In my actual analysis, I have a 4D array containing matrices (output from a Monte Carlo analysis), meaning that my loop would have to be `for (i in dim1) for(j in dim2) ...`, which is much clunkier than doing `apply(obj, 1:2, func)`. Also, as I said, [loops in R are notoriously slow](http://stackoverflow.com/questions/7142767/why-are-loops-slow-in-r), and just because they're faster than the antiquated `aaply` doesn't mean there isn't a better approach out there...which is exactly why I posed the question :) – Alexey Shiklomanov Jun 12 '16 at 17:03
  • Loops aren't really any more inherently slow the equivalent constructs in R, like `apply`, etc.; this is a common misconception. The problem is that people often abuse them by not preallocating objects that are modified by the loop operation, which is inefficient. – nrussell Jun 12 '16 at 17:10
  • [For example](https://gist.github.com/nathan-russell/84f7bde6e0c4afdb4ac055fe96eb5342#file-loop-vs-apply-r) (of course, `rowSums` would be the ideal choice in that example, but regardless...). – nrussell Jun 12 '16 at 17:30

0 Answers0