Looking for faster way to implement logSumExp across multidimensional array

Question

I have a line in some R code I am writing that is quite slow. It applies logSumExp across a 4 dimensional array using the apply command. I'm wondering are there ways to speed it up!

Reprex: (this might take 10seconds or more to run)

library(microbenchmark)
library(matrixStats)

array4d <- array( runif(5*500*50*5 ,-1,0),
                  dim = c(5, 500, 50, 5) )
microbenchmark(
    result <- apply(array4d, c(1,2,3), logSumExp)
)

Any advice appreciated!

score 2 · Answer 1 · answered Jul 14 '20 at 10:09

2

rowSums is a less general version of apply that is optimised for speed when adding up, so this can be used to speed up the calculation. Note the caveat in the helpfile ?rowSums if it's important to maintain a difference in your calculations between NA and NaN.

library(microbenchmark)
library(matrixStats)

array4d <- array( runif(5*500*50*5 ,-1,0),
                  dim = c(5, 500, 50, 5) )
microbenchmark(
  result <- apply(array4d, c(1,2,3), logSumExp),
  result2 <- log(rowSums(exp(array4d), dims=3))
)


# Unit: milliseconds
#                                            expr      min       lq      mean    median        uq      max neval
# result <- apply(array4d, c(1, 2, 3), logSumExp) 249.4757 274.8227 305.24680 297.30245 328.90610 405.5038   100
# result2 <- log(rowSums(exp(array4d), dims = 3))  31.8783  32.7493  35.20605  33.01965  33.45205 133.3257   100

all.equal(result, result2)

#TRUE

This results in a 9x speed increase on my computer

answered Jul 14 '20 at 10:09

Miff

7,486
20
20

Indeed this gave a 37x speedup on my machine! But, I'm concerned about precision, it gives TRUE for ```all.equal()```, but not for ```identical()```. I was trying to implement a variation of this answer, but rowLogSumExps does not behave the same way as rowMeans so I can't make it work: https://stackoverflow.com/a/18633390/2498193 – user2498193 Jul 14 '20 at 10:21
1

@user2498193 In general you wouldn't expect two different ways of calculating it to give the same answer because of machine precision and rounding. `range(result - result2)` gives numbers around 10^-16, so I'd suggest that the differences are negligible – Miff Jul 14 '20 at 10:32
1

[Obligatory link](https://stackoverflow.com/questions/9508518/why-are-these-numbers-not-equal) on why the numbers are not quite the same – Miff Jul 14 '20 at 10:36
I had another query that I thought merited a follow on question if you have time (https://stackoverflow.com/questions/62894184/looking-for-speedups-for-multidimensional-array-code) - but if not thank you for your help thus far! – user2498193 Jul 14 '20 at 11:32
Alas! This has come back to bite me - with extreme data this is producing infinities, whilst the original does not – user2498193 Jul 17 '20 at 11:59
So I figured out this is due to the so called underflow issue with logSumExp. @Miff, I combined your magic with some dimensional voodoo inspired by the ```array()``` function code, and the 'logSumExp' trick (https://www.xarg.org/2016/06/the-log-sum-exp-trick-in-machine-learning/) to get a version that's not quite as fast as yours, but does not suffer this underflow problem. Posting as a new answer! – user2498193 Jul 17 '20 at 19:35

score 0 · Accepted Answer · answered Jul 17 '20 at 19:42

The otherwise great solution from @Miff was causing my code to crash with certain datasets as infinities were being produced which I eventually figured out was due to an underflow problem which can be avoided by using the 'logSumExp trick': https://www.xarg.org/2016/06/the-log-sum-exp-trick-in-machine-learning/

Taking inspiration from @Miff 's code, and the R apply() function, I made a new function to gives faster calculations while avoiding the underflow issue. Not quite as fast as @Miff 's solution however. Posting in case it helps others

apply_logSumExp <- function (X) {
    MARGIN <- c(1, 2, 3) # fixing the margins as have not tested other dims
    dl <- length(dim(X)) # get length of dim
    d <- dim(X) # get dim
    dn <- dimnames(X) # get dimnames
    ds <- seq_len(dl) # makes sequences of length of dims
    d.call <- d[-MARGIN]    # gets index of dim not included in MARGIN
    d.ans <- d[MARGIN]  # define dim for answer array
    s.call <- ds[-MARGIN] # used to define permute
    s.ans <- ds[MARGIN]     # used to define permute
    d2 <- prod(d.ans)   # length of results object
    
    newX <- aperm(X, c(s.call, s.ans)) # permute X such that dims omitted from calc are first dim
    dim(newX) <- c(prod(d.call), d2) # voodoo. Preserves ommitted dim dimension but collapses the rest into 1
    
    maxes <- colMaxs(newX)
    ans <- maxes + log(colSums(exp( sweep(newX, 2, maxes, "-"))) )
    ans <- array(ans, d.ans)
    
    return(ans)
}

 > microbenchmark(
+     res1 <- apply(array4d, c(1,2,3), logSumExp),
+     res2 <- log(rowSums(exp(array4d), dims=3)),
+     res3 <- apply_logSumExp(array4d)
+ )
Unit: milliseconds
                                          expr        min         lq       mean    median        uq       max
 res1 <- apply(array4d, c(1, 2, 3), logSumExp) 176.286670 213.882443 247.420334 236.44593 267.81127 486.41072
  res2 <- log(rowSums(exp(array4d), dims = 3))   4.664907   5.821601   7.588448   5.97765   7.47814  30.58002
              res3 <- apply_logSumExp(array4d)  12.119875  14.673011  19.635265  15.20385  18.30471  90.59859
 neval cld
   100   c
   100 a  
   100  b

Looking for faster way to implement logSumExp across multidimensional array

2 Answers2

Linked