17

Consider the array a:

> a <- array(c(1:9, 1:9), c(3,3,2))
> a
, , 1

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

, , 2

     [,1] [,2] [,3]
[1,]    1    4    7
[2,]    2    5    8
[3,]    3    6    9

How do we efficiently compute the row sums of the matrices indexed by the third dimension, such that the result is:

     [,1] [,2]
[1,]   12   12
[2,]   15   15
[3,]   18   18

??

The column sums are easy via the 'dims' argument of colSums():

> colSums(a, dims = 1)

but I cannot find a way to use rowSums() on the array to achieve the desired result, as it has a different interpretation of 'dims' to that of colSums().

It is simple to compute the desired row sums using:

> apply(a, 3, rowSums)
     [,1] [,2]
[1,]   12   12
[2,]   15   15
[3,]   18   18

but that is just hiding the loop. Are there other efficient, truly vectorised, ways of computing the required row sums?

Gavin Simpson
  • 170,508
  • 25
  • 396
  • 453

4 Answers4

15

@Fojtasek's answer mentioned splitting up the array reminded me of the aperm() function which allows one to permute the dimensions of an array. As colSums() works, we can swap the first two dimensions using aperm() and run colSums() on the output.

> colSums(aperm(a, c(2,1,3)))
     [,1] [,2]
[1,]   12   12
[2,]   15   15
[3,]   18   18

Some comparison timings of this and the other suggested R-based answers:

> b <- array(c(1:250000, 1:250000),c(5000,5000,2))
> system.time(rs1 <- apply(b, 3, rowSums))
   user  system elapsed 
  1.831   0.394   2.232 
> system.time(rs2 <- rowSums3d(b))
   user  system elapsed 
  1.134   0.183   1.320 
> system.time(rs3 <- sapply(1:dim(b)[3], function(i) rowSums(b[,,i])))
   user  system elapsed 
  1.556   0.073   1.636
> system.time(rs4 <- colSums(aperm(b, c(2,1,3))))
   user  system elapsed 
  0.860   0.103   0.966 

So on my system the aperm() solution appears marginally faster:

> sessionInfo()
R version 2.12.1 Patched (2011-02-06 r54249)
Platform: x86_64-unknown-linux-gnu (64-bit)

However, rowSums3d() doesn't give the same answers as the other solutions:

> all.equal(rs1, rs2)
[1] "Mean relative difference: 0.01999992"
> all.equal(rs1, rs3)
[1] TRUE
> all.equal(rs1, rs4)
[1] TRUE
Gavin Simpson
  • 170,508
  • 25
  • 396
  • 453
  • That's cool, I didn't even know there was an aperm() function! It's interesting to me that rs1 and rs3 are not the same, given that the output to me looks similar. Will have to look into this at some point :) – Tony Breyal Feb 27 '11 at 23:07
  • 1
    ahh, I see the issue now, you need to get rid of the "[3]" in rs3. I was only using it to get the final elapsed time which I could then run summary() on so as to account for run-to-run fluctuations :) – Tony Breyal Feb 27 '11 at 23:28
  • @Tony, ahh of course! I've updated the answer to this effect, thanks again for the suggestion. – Gavin Simpson Feb 27 '11 at 23:52
  • Hmm, in the absence of any further answers, I'm going to accept my own as it was the fastest of the provided solutions and the answer from @Fojtasek doesn't result in the requested answer even though it led me to `aperm()`. If anyone comes up with anything better than the `aperm()` version, I'll change the accepted answer. – Gavin Simpson Mar 01 '11 at 19:19
  • Oh my ... why is there no numpy like interface to R arrays -.- – jan-glx Jul 16 '19 at 14:22
6

You could chop up the array into two dimensions, compute row sums on that, and then put the output back together the way you want it. Like so:

rowSums3d <- function(a){
    m <- matrix(a,ncol=ncol(a))
    rs <- rowSums(m)
    matrix(rs,ncol=2)
}

> a <- array(c(1:250000, 1:250000),c(5000,5000,2))
> system.time(rowSums3d(a))
   user  system elapsed 
   1.73    0.17    1.96 
> system.time(apply(a, 3, rowSums))
   user  system elapsed 
   3.09    0.46    3.74 
Fojtasek
  • 3,492
  • 1
  • 26
  • 23
  • Thanks, but there are a couple of infelicities with your function. i) the line creating `z` doesn't seem to get used anywhere else, and ii) the function argument is `x` yet `a` is used throughout. – Gavin Simpson Feb 27 '11 at 21:27
  • Oops! That's what I get for bouncing around between different ideas in an interactive session and a separate window. I've edited these problems out. – Fojtasek Feb 27 '11 at 21:53
  • Which platform and version of R did you use for this? I'm getting a different relative timing on my machine for your code (see my answer below). Only asking in case the difference is due to platform/R-version. – Tony Breyal Feb 27 '11 at 22:05
  • Thanks for updating your function, however, on my system, `rowSums3d()` doesn't yield the same result as `apply()` - see my answer below. – Gavin Simpson Feb 27 '11 at 22:56
4

I don't know about the most efficient way of doing this, but sapply seems to do well

a <- array(c(1:9, 1:9), c(3,3,2))
x1 <- sapply(1:dim(a)[3], function(i) rowSums(a[,,i]))
x1
     [,1] [,2]
[1,]   12   12
[2,]   15   15
[3,]   18   18

x2 <- apply(a, 3, rowSums)
all.equal(x1, x2)
[1] TRUE

Which gives a speed improvement as follows:

> a <- array(c(1:250000, 1:250000),c(5000,5000,2))

> summary(replicate(10, system.time(rowSums3d(a))[3]))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.784   2.799   2.810   2.814   2.821   2.862 

> summary(replicate(10, system.time(apply(a, 3, rowSums))[3]))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  2.730   2.755   2.766   2.776   2.788   2.839 

> summary(replicate(10, system.time( sapply(1:dim(a)[3], function(i) rowSums(a[,,i])) )[3]))
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
  1.840   1.852   1.867   1.872   1.893   1.914 

Timings were done on:

# Ubuntu 10.10
# Kernal Linux 2.6.35-27-generic
> sessionInfo()
R version 2.12.1 (2010-12-16)
Platform: x86_64-pc-linux-gnu (64-bit)
Tony Breyal
  • 5,338
  • 3
  • 29
  • 49
  • Thanks for the answer. However, something is not right with your `sapply()` code as it appears to be returning a scalar value. See my answer for comparison code that shows the returned value is not the same as that from `apply()`. – Gavin Simpson Feb 27 '11 at 22:57
  • 2
    @Gavin See my comment in your answer, the reason for the scalar value is because of the "[3]" which isn't required for your comparisons :) – Tony Breyal Feb 27 '11 at 23:31
1

If you have a multi-core system you could write a simple C function and make use of the Open MP parallel threading library. I've done something similar for a problem of mine and I get an 8 fold increase on an 8 core system. The code will still work on a single-processor system and even compile on a system without OpenMP, perhaps with a smattering of #ifdef _OPENMP here and there.

Of course its only worth doing if you know that's what's taking most of the time. Do profile your code before optimising.

Spacedman
  • 92,590
  • 12
  • 140
  • 224
  • Multicoring apply family functions in R could not be easier (could be, actually, but let's imagine it could not). Using `snowfall`, after you've initialized your cores, you would do `sfSapply(...)` and you're set. – Roman Luštrik Feb 28 '11 at 09:34
  • seems a lot of trouble to go to when you can do something similar in fast R code using `colSums()`. It's a bit frustrating that `rowSums()` takes a different approach to `'dims'`, but I was hoping I'd overlooked something in using `rowSums()`. Also, the speed up from multi-threading would need to be significant to overcome the cost of dispatching and managing the threads/processes. – Gavin Simpson Feb 28 '11 at 10:26
  • 1
    In my likelihood code which is doing something similar to rowSums I get an 8x speedup - which is the difference between getting a few things done every day to getting one thing done every two days! Well worth the near-zero effort (I coded the whole thing in R first, then in C for a 10x speedup, added OpenMP for an ultimate 80x speedup) – Spacedman Feb 28 '11 at 13:10