I have two multidimensional arrays, a 4D array and a 3D array, and some code that finds the position of the maximum of the 4D array along its last dimension and builds an index for selecting from the 3D array based on this. At the moment it's quite slow and I'd like to speed things up.

Reprex:

library(microbenchmark)

# Make some arrays to test with
array4d <- array( runif(5*500*50*5 ,-1,0),
                  dim = c(5, 500, 50, 5) )
array3d <- array( runif(5*500*5, 0, 1),
                        dim = c(5, 500, 5))

# The code of interest
microbenchmark( {
    max_idx <- apply(array4d, c(1,2,3), which.max )
    selections <- list()
    for( i in 1:dim(array4d)[3] ){
        selections[[i]] <- apply(array3d, c(1,2), which.max) == max_idx[ , , i]
    }
})

Any tips appreciated!

(A side issue is I'm considering replacing which.max by nnet::which.is.max to have random breaking of ties)
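For the tie-breaking side issue, one possible sketch uses base R's `max.col`, whose `ties.method` argument defaults to `"random"`. This is a substitute for `nnet::which.is.max`, not the same function, and `small4d`, `m`, and `rand_idx` are illustrative names only. As documented in `?max.col`, ties are detected with a small relative tolerance, so near-ties are also broken at random rather than only exact ones.

```r
# Illustrative data with deliberate exact ties along the last dimension.
small4d <- array(c(1, 1, 1, 1,    # slice [ , , , 1]
                   1, 2, 1, 0),   # slice [ , , , 2]
                 dim = c(2, 2, 1, 2))
d <- dim(small4d)

# Arrays are stored column-major, so this gives one row per cell of the
# first three dimensions and one column per slice of the last dimension.
m <- matrix(small4d, ncol = d[4])

# ties.method = "random" picks at random among the tied maxima in each row.
rand_idx <- array(max.col(m, ties.method = "random"), dim = d[1:3])
```

Cells without ties always get the position of their single maximum; only the tied cells vary between runs.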

Edit: A faster solution thanks to @GKi, but I'm still hoping for further speedups:

max_idx <- apply(array4d, c(1,2,3), which.max)
max_idx2 <- apply(array3d, c(1,2), which.max)
selections <- lapply(seq_len(dim(array4d)[3]), function(i) max_idx2 == max_idx[ , , i])
user2498193

1 Answer

You can put apply(array3d, c(1,2), which.max) outside the loop.

microbenchmark( {
  max_idx <- apply(array4d, c(1,2,3), which.max)
  max_idx2 <- apply(array3d, c(1,2), which.max)
  selections <- lapply(seq_len(dim(array4d)[3]), function(i) max_idx2 == max_idx[ , , i])
},
{
  max_idx <- apply(array4d, c(1,2,3), which.max )
  selections <- list()
  for( i in 1:dim(array4d)[3] ){
    selections[[i]] <- apply(array3d, c(1,2), which.max) == max_idx[ , , i]
  }
})
#      min       lq     mean   median       uq      max neval cld
# 204.1650 228.0010 260.3101 256.0132 271.6664 433.8932   100  a 
# 396.5284 448.4167 495.3885 487.7741 530.9028 693.5601   100   b
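
Most of the remaining time is likely spent in `apply(array4d, c(1,2,3), which.max)`, which calls `which.max` once per cell (125,000 times for the test arrays). A minimal sketch of one way to avoid that, assuming the arrays contain no `NA` values: reshape the array so the last dimension becomes the columns of a matrix, then sweep over those few columns while tracking the running maximum. The helper name `which_max_last` is hypothetical (it is not part of base R), and it has not been benchmarked here.

```r
# Test arrays as in the question.
array4d <- array(runif(5 * 500 * 50 * 5, -1, 0), dim = c(5, 500, 50, 5))
array3d <- array(runif(5 * 500 * 5, 0, 1), dim = c(5, 500, 5))

# Hypothetical helper: position of the maximum along the LAST dimension,
# intended to match apply(a, <all dims but the last>, which.max).
# Assumes no NA/NaN values; ties go to the first maximum, as with which.max.
which_max_last <- function(a) {
  d <- dim(a)
  k <- d[length(d)]
  # Column-major storage: each row of m is one cell of the leading
  # dimensions, each column is one slice of the last dimension.
  m <- matrix(a, ncol = k)
  best <- m[, 1]
  idx <- rep(1L, nrow(m))
  for (j in seq_len(k)[-1]) {
    better <- m[, j] > best        # strict ">" keeps the first maximum on ties
    idx[better] <- j
    best[better] <- m[better, j]
  }
  array(idx, dim = d[-length(d)])
}

max_idx  <- which_max_last(array4d)   # 5 x 500 x 50
max_idx2 <- which_max_last(array3d)   # 5 x 500
selections <- lapply(seq_len(dim(array4d)[3]),
                     function(i) max_idx2 == max_idx[ , , i])
```

The loop runs once per slice of the last dimension (5 iterations here) instead of once per cell, with each iteration being a vectorised comparison. It is worth checking the result against the `apply` version on your own data before relying on it.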
GKi
  • Argh, can't believe I missed that. Will incorporate it into the question. I know from here (https://stackoverflow.com/a/62892754/2498193) that the apply commands themselves are slow too, but I haven't been able to figure out how to apply this approach with which.max – user2498193 Jul 14 '20 at 11:59
  • @user2498193 So one part you are looking for is something that is significantly faster than `apply(array4d, c(1,2,3), which.max)`? If yes, maybe ask a separate question only on that topic. – GKi Jul 14 '20 at 12:20
  • Ok, can do (never too sure what merits a separate question on here, to be honest). Just reprofiled my code with your suggestion and it's considerably faster already - many thanks – user2498193 Jul 14 '20 at 12:25