Faster alternative to for loops for einstein summation?

Question

I have a piece of code ported from Python to R. The original Python version uses np.einsum. Since I could not find an np.einsum equivalent in R, and I wanted to be sure I understood it, I directly coded it using for loops. Now, I'm wondering if there is a faster alternative.

Example code:

n = 2 ; d = 3 ; nx = 4 ; v = 5

array4d <- array( runif(n*nx*v*d ,-1,0),
                  dim = c(n, nx, v, d) )

array3d <- array( runif(n*v*d ,-1,0),
                  dim = c(n, v, d) )

einsum_result <- array( rep(0, n*nx*d),
                        dim = c(n, nx, d))

# original Python version: np.einsum('ikl,ijkl->ijl', array3d, array4d, optimize=False)
# R version
for (i in 1: n) {
    for( j in 1: nx) {
        for( l in 1: d ) { 
            einsum_result[i, j, l] <- einsum_result[i, j, l] +
                sum( array3d[i, , l] * array4d[i, j, , l])
        }}}

I have tried to remove the j loop (since j / nx will usually be the biggest number) using matrix multiplication, but could not work it out correctly. Any advise appreciated!

Not exactly an answer to your question: I'm not familiar with einstein summation. But in package `calculus` there is a function `einstein`. Maybe that's what you are looking for? — Martin Gal, Jul 16 '20 at 10:09
Yeah I found that allright but I don't think if functions the same as the Python version. Explainer on numpy.einsum here: https://stackoverflow.com/questions/26089893/understanding-numpys-einsum I'm not a huge Python fan but I do like this function! I actually find this answer clearer than the accepted answer: https://stackoverflow.com/a/59858877/2498193 But my main question is about speed - If you see a way to make my for loops faster can forget the einsum part! — user2498193, Jul 16 '20 at 10:31

score 1 · Answer 1 · answered May 25 '22 at 12:38

Take a look at the einsum package inspired by the NumPy function. It's einsum function isn't especially fast, but the einsum_generator function will return a cpp function (with code) that is pretty fast (once it's compiled).

library(einsum)

set.seed(6)

n = 5 ; d = 10; nx = 400; v = 15

array4d <- array(runif(n*nx*v*d ,-1,0),
                 dim = c(n, nx, v, d))

array3d <- array(runif(n*v*d ,-1,0),
                 dim = c(n, v, d))

einsum_result <- array(rep(0, n*nx*d),
                       dim = c(n, nx, d))

einsum1 <- function(einsum_result) {
  for (i in 1: n) {
    for( j in 1: nx) {
      for( l in 1: d ) { 
        einsum_result[i, j, l] <- einsum_result[i, j, l] +
          sum( array3d[i, , l] * array4d[i, j, , l])
      }
    }
  }
  einsum_result
}

einsum2 <- function(einsum_result) {
  for (i in 1: n) {
    for( l in 1: d ) {     
      einsum_result[i, , l] <- einsum_result[i, , l] +
        rowSums(sweep(array4d[i, , , l], MARGIN=2, array3d[i, , l], `*`) )
    }
  }
  einsum_result
}

system.time(einsumCpp <- einsum_generator("ikl, ijkl -> ijl"))
#> user  system elapsed 
#> 0.07    0.07    7.83
microbenchmark::microbenchmark(einsum1 = einsum1(einsum_result),
                               einsum2 = einsum2(einsum_result),
                               einsum = einsum("ikl, ijkl -> ijl", array3d, array4d),
                               einsumCpp = einsumCpp(array3d, array4d),
                               check = "equal")
#> Unit: microseconds
#>      expr        min         lq      mean     median        uq        max neval
#>   einsum1  48950.100  52293.800  56474.49  55209.801  59354.95  76777.201   100
#>   einsum2   9493.801  10342.201  11381.02  11093.451  12233.45  15554.401   100
#>    einsum 168312.101 175950.002 187269.22 183263.302 192122.10 250520.301   100
#> einsumCpp    892.401   1019.101   1338.10   1356.951   1521.45   2599.701   100

user2498193 · Answer 2 · 2020-07-16T12:33:48.810

Ok I made some progress to eliminate the j loop using sweep and rowSums, so posting this as an answer. I've increased the dimensions of the arrays because the benefit wasn't really appreciable with the small array in the original question.

n = 5 ; d = 10; nx = 400; v = 15

array4d <- array( runif(n*nx*v*d ,-1,0),
                  dim = c(n, nx, v, d) )

array3d <- array( runif(n*v*d ,-1,0),
                  dim = c(n, v, d) )

einsum_result1 <- array( rep(0, n*nx*d),
                        dim = c(n, nx, d))
einsum_result2 <- einsum_result1
    
microbenchmark({
    for (i in 1: n) {
        for( j in 1: nx) {
            for( l in 1: d ) { 
                einsum_result1[i, j, l] <- einsum_result1[i, j, l] +
                    sum( array3d[i, , l] * array4d[i, j, , l])
            }}} },
    {
    for (i in 1: n) {
        #for( j in 1: nx) {
        for( l in 1: d ) {     
            einsum_result2[i, , l] <- einsum_result2[i, , l] +
                rowSums( sweep(array4d[i, , , l], MARGIN=2, array3d[i, , l], `*`) )
        }}})

     min       lq     mean   median       uq       max neval cld
 40.53148 42.64406 48.30703 47.65992 50.87022 179.93073   100   b
 10.86891 11.09062 12.02787 11.22610 11.56781  27.29123   100  a 
> identical(einsum_result1, einsum_result2)
[1] TRUE

Anyone see a way to remove any remaining loops and replace with vectorised code? I've leave the question open for a while but if no suggestions I guess I'll accept my own answer

Faster alternative to for loops for einstein summation?

2 Answers2