Omit inf from row sum in R

Question

I am trying to sum the rows of a matrix that contains Inf values. How do I sum each row while omitting the Inf entries?

I'm shocked that no one has asked the traditional StackOverflow question: "What have you tried?" — Ryan Amos, Mar 13 '13 at 21:40
I'm shocked at +9 (so far). Upvote clearly states *This question shows research effort* — Simon O'Hanlon, Mar 13 '13 at 21:46

score 36 · Answer 1 · answered Mar 13 '13 at 18:26

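Multiply your matrix by the result of is.finite(m) and call rowSums on the product with na.rm=TRUE. This works because the logical matrix is coerced to 0/1: finite entries are multiplied by 1 and stay unchanged, while Inf*0 is NaN, which na.rm=TRUE then drops.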

m <- matrix(c(1:3,Inf,4,Inf,5:6),4,2)
rowSums(m*is.finite(m),na.rm=TRUE)
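
To make the intermediate steps visible, here is a small sketch of the same computation split into parts. The names mask, masked and sums are only illustrative and are not part of the original answer.

```r
m <- matrix(c(1:3, Inf, 4, Inf, 5:6), 4, 2)

mask <- is.finite(m)   # logical matrix, FALSE wherever m is Inf
masked <- m * mask     # TRUE/FALSE are coerced to 1/0, so Inf * 0 becomes NaN

# na.rm = TRUE drops NaN as well as NA, so only the finite entries are summed
sums <- rowSums(masked, na.rm = TRUE)
sums  # 5 2 8 6
```

Note that any NA or NaN already present in m would also be dropped by na.rm=TRUE.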

Joshua Ulrich


Jouni Helske · Answer 2 · 2013-03-13T18:42:03.723

22

A[is.infinite(A)]<-NA
rowSums(A,na.rm=TRUE)
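
If overwriting A is a concern, a variant along these lines should leave the original matrix untouched. This is a sketch using base R's replace(), which returns a modified copy; the example matrix is borrowed from the first answer and is only an assumption here.

```r
A <- matrix(c(1:3, Inf, 4, Inf, 5:6), 4, 2)

# replace() returns a copy with the selected entries set to NA; A itself is not modified
sums <- rowSums(replace(A, is.infinite(A), NA), na.rm = TRUE)
sums  # 5 2 8 6

# A still contains its Inf values
any(is.infinite(A))  # TRUE
```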

Some benchmarking for comparison:

library(microbenchmark)


rowSumsMethod<-function(A){
 A[is.infinite(A)]<-NA
 rowSums(A,na.rm=TRUE)
}
applyMethod<-function(A){
 apply( A , 1 , function(x){ sum(x[!is.infinite(x)])})
}

rowSumsMethod2<-function(m){
  rowSums(m*is.finite(m),na.rm=TRUE) 
}

rowSumsMethod0<-function(A){
 A[is.infinite(A)]<-0
 rowSums(A)
}

A1 <- matrix(sample(c(1:5, Inf), 50, TRUE), ncol=5)
A2 <- matrix(sample(c(1:5, Inf), 5000, TRUE), ncol=5)
microbenchmark(rowSumsMethod(A1),rowSumsMethod(A2),
               rowSumsMethod0(A1),rowSumsMethod0(A2),
               rowSumsMethod2(A1),rowSumsMethod2(A2),
               applyMethod(A1),applyMethod(A2))

Unit: microseconds
               expr      min        lq    median        uq      max neval
  rowSumsMethod(A1)   13.063   14.9285   16.7950   19.3605 1198.450   100
  rowSumsMethod(A2)  212.726  220.8905  226.7220  240.7165  307.427   100
 rowSumsMethod0(A1)   11.663   13.9960   15.3950   18.1940  112.894   100
 rowSumsMethod0(A2)  103.098  109.6290  114.0610  122.9240  159.545   100
 rowSumsMethod2(A1)    8.864   11.6630   12.5960   14.6955   49.450   100
 rowSumsMethod2(A2)   57.380   60.1790   63.4450   67.4100   81.172   100
    applyMethod(A1)   78.839   84.4380   92.1355   99.8330  181.005   100
    applyMethod(A2) 3996.543 4221.8645 4338.0235 4552.3825 6124.735   100

So Joshua's method wins! And the apply method is clearly slower than the rowSums-based methods (relatively speaking, of course).
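
Before reading much into the timings, it may be worth checking that the four candidates actually return the same sums. The sketch below repeats the function definitions so it runs on its own; the seed value and the names reference and agree are arbitrary choices, not part of the benchmark above.

```r
rowSumsMethod  <- function(A) { A[is.infinite(A)] <- NA; rowSums(A, na.rm = TRUE) }
rowSumsMethod0 <- function(A) { A[is.infinite(A)] <- 0; rowSums(A) }
rowSumsMethod2 <- function(m) rowSums(m * is.finite(m), na.rm = TRUE)
applyMethod    <- function(A) apply(A, 1, function(x) sum(x[!is.infinite(x)]))

set.seed(42)  # arbitrary seed, only so the check is repeatable
A <- matrix(sample(c(1:5, Inf), 5000, TRUE), ncol = 5)

# Compare every method against the NA-replacement version
reference <- rowSumsMethod(A)
agree <- c(
  zero     = isTRUE(all.equal(reference, rowSumsMethod0(A))),
  multiply = isTRUE(all.equal(reference, rowSumsMethod2(A))),
  looped   = isTRUE(all.equal(reference, applyMethod(A)))
)
agree
```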

edited Mar 13 '13 at 18:42

answered Mar 13 '13 at 18:15

Jouni Helske


You can do it in one with `!is.infinite()`! – Simon O'Hanlon Mar 13 '13 at 18:15
So I'd use `sums <- apply( A , 1 , FUN = function(x){ sum(x[!is.infinite(x)])})` – Simon O'Hanlon Mar 13 '13 at 18:19
You realise that the unit of measurement is 1 millionth of a second right?! But yes, NA subsetting is quicker by 0.004 seconds for larger matrices! :-) – Simon O'Hanlon Mar 13 '13 at 18:31
This is wasting time trying to save time. – Señor O Mar 13 '13 at 18:35
Yes, of course the differences are minuscule; I didn't think there would be any meaningful differences, it's just fun to benchmark things :) – Jouni Helske Mar 13 '13 at 18:36
If you're going to replace values in the matrix, you could have replaced with `0` and left `na.rm=FALSE`, which would likely be faster. – Joshua Ulrich Mar 13 '13 at 18:37
+1 for having fun benchmarking and for posting the results of course ;) – Jilber Urbina Mar 13 '13 at 18:39

score 11 · Answer 3 · answered Mar 13 '13 at 18:22

11

I'd use apply and is.infinite in order to avoid replacing Inf values with NA as in @Hemmo's answer.

> set.seed(1)
> Mat <- matrix(sample(c(1:5, Inf), 50, TRUE), ncol=5)
> Mat # this is an example
      [,1] [,2] [,3] [,4] [,5]
 [1,]    2    2  Inf    3    5
 [2,]    3    2    2    4    4
 [3,]    4    5    4    3    5
 [4,]  Inf    3    1    2    4
 [5,]    2    5    2    5    4
 [6,]  Inf    3    3    5    5
 [7,]  Inf    5    1    5    1
 [8,]    4  Inf    3    1    3
 [9,]    4    3  Inf    5    5
[10,]    1    5    3    3    5
> apply(Mat, 1, function(x) sum(x[!is.infinite(x)]))
 [1] 12 15 21 10 18 16 12 11 17 17
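
Since R 3.6.0 changed the default sampling algorithm, set.seed(1) may no longer reproduce the matrix shown above. A sketch that enters the same values by hand (copied from the printed output) avoids depending on the seed; the name sums is only illustrative.

```r
# Values copied row by row from the printed Mat above
Mat <- matrix(c(
    2,   2, Inf, 3, 5,
    3,   2,   2, 4, 4,
    4,   5,   4, 3, 5,
  Inf,   3,   1, 2, 4,
    2,   5,   2, 5, 4,
  Inf,   3,   3, 5, 5,
  Inf,   5,   1, 5, 1,
    4, Inf,   3, 1, 3,
    4,   3, Inf, 5, 5,
    1,   5,   3, 3, 5
), ncol = 5, byrow = TRUE)

# Same call as in the answer: drop the infinite entries of each row, then sum
sums <- apply(Mat, 1, function(x) sum(x[!is.infinite(x)]))
sums  # 12 15 21 10 18 16 12 11 17 17
```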

Jilber Urbina


We seem to have posted the exact same method! – Simon O'Hanlon Mar 13 '13 at 18:35
I'm delivering +1's all round, for again illuminating many ways to do the same thing, and for making me think about the *best* way to do something simple. I like Joshua's trick. – Simon O'Hanlon Mar 13 '13 at 18:39

Simon O'Hanlon · Answer 4 · 2013-03-13T18:24:30.493

Try this...

m <- c( 1 ,2 , 3 , Inf , 4 , Inf ,5 )
sum(m[!is.infinite(m)])

Or

m <- matrix( sample( c(1:10 , Inf) , 100 , replace = TRUE ) , nrow = 10 )
sums <- apply( m , 1 , FUN = function(x){ sum(x[!is.infinite(x)])})

> m
      [,1] [,2] [,3] [,4] [,5] [,6] [,7] [,8] [,9] [,10]
 [1,]    8    9    7  Inf    9    2    2    6    1   Inf
 [2,]    8    7    4    5    9    5    8    4    7    10
 [3,]    7    9    3    4    7    3    3    6    9     4
 [4,]    7  Inf    2    6    4    8    3    1    9     9
 [5,]    4  Inf    7    5    9    5    3    5    9     9
 [6,]    7    3    7  Inf    7    3    7    3    7     1
 [7,]    5    7    2    1  Inf    1    9    8    1     5
 [8,]    4  Inf   10  Inf    8   10    4    9    7     2
 [9,]   10    7    9    7    2  Inf    4  Inf    4     6
[10,]    9    4    6    3    9    6    6    5    1     8

> sums
 [1] 44 67 55 49 56 45 39 54 49 57
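
A couple of edge cases may be worth knowing about with this approach. The sketch below uses a small made-up matrix (not from the question) with a row made up entirely of infinite values and a row containing NA, and compares !is.infinite(x) with is.finite(x) as the filter.

```r
# Hypothetical matrix: row 2 is entirely infinite, row 3 contains an NA
m <- rbind(c(1, 2, Inf),
           c(Inf, -Inf, Inf),
           c(3, NA, 4))

# Dropping only infinite values: an empty row sums to 0, but NA propagates
drop_inf <- apply(m, 1, function(x) sum(x[!is.infinite(x)]))
drop_inf  # 3 0 NA

# Keeping only finite values also drops NA and NaN
keep_finite <- apply(m, 1, function(x) sum(x[is.finite(x)]))
keep_finite  # 3 0 7
```

Which filter is appropriate depends on whether missing values should be treated like infinite ones.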

score 3 · Answer 5 · edited Mar 13 '13 at 19:53

This is a "non-apply" and non-destructive approach:

rowSums( matrix(match(A, A[is.finite(A)]), nrow(A)), na.rm=TRUE)
[1] 2 4

Although it is reasonably efficient, it is not as fast as Joshua's multiplication method.
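
One thing to keep in mind is that match() returns positions in its second argument rather than the values themselves, so summing its result gives the row sums only when each finite value happens to equal its position among the finite values. A possible variant that indexes back into the finite values is sketched below; the matrix A and the names fin and idx are assumptions for illustration, not the matrix used in the answer.

```r
A <- matrix(c(1, Inf, 5, 7), nrow = 2)  # hypothetical 2 x 2 example

fin <- A[is.finite(A)]   # finite values in column-major order: 1 5 7
idx <- match(A, fin)     # positions within fin, NA for Inf: 1 NA 2 3

# Summing the positions themselves does not give the row sums here
position_sums <- rowSums(matrix(idx, nrow(A)), na.rm = TRUE)
position_sums  # 3 3

# Indexing back into fin recovers the values, with NA where A was Inf
value_sums <- rowSums(matrix(fin[idx], nrow(A)), na.rm = TRUE)
value_sums  # 6 7
```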

edited Mar 13 '13 at 19:53

Arun


answered Mar 13 '13 at 18:48

IRTFM


Okay, I think you meant `match(A, A[is.finite(A)])`. I've edited. Hope you don't mind. – Arun Mar 13 '13 at 19:54
That was not the code that worked in my session. Seems as though it would be less efficient. – IRTFM Mar 13 '13 at 20:48
You mean my edit isn't your code? I replaced `is.finite(A)` with `A[is.finite(A)]`. Without this, `match` returns almost all NAs because it compares every value against TRUE. So only the value 1 will be matched with TRUE. Every other value will get NA. – Arun Mar 13 '13 at 20:56
I guess my test case had different results, but it was just a 2 x 2 matrix. – IRTFM Mar 14 '13 at 05:48
