I can't figure out how to use an is.na(x)-like function for infinite values in R on a data.table, or how to show the number of Inf values per column, along the lines of colSums(is.infinite(x)).

I use the following example data set:

DT <- data.table(a=c(1/0,1,2/0),b=c("a","b","c"),c=c(1/0,5,NA))
DT
     a b   c
1: Inf a Inf
2:   1 b   5
3: Inf c  NA
colSums(is.na(DT))
a b c 
0 0 1 
colSums(is.infinite(DT))
Error in is.infinite(DT) : default method not implemented for type 'list'
DT[is.na(DT)] <- 100
DT
     a b   c
1: Inf a Inf
2:   1 b   5
3: Inf c 100

DT[is.infinite(DT)] <- 100
Error in is.infinite(DT) : default method not implemented for type 'list'

I found in this post how to replace Inf with NA, but I would think there should be a nicer way of achieving this, with is.infinite for example. I would also like to see the number of Inf values per column. Any ideas?

Many thanks. BR Tim

Tim_Utrecht

1 Answer

is.finite and is.infinite don't have data.frame or data.table methods the way is.na does (compare methods(is.infinite) vs methods(is.na)).

Alternatively, you could loop through the columns and then use colSums

DT[, colSums(sapply(.SD, is.infinite))]
# a b c 
# 2 0 1 

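Alternatively, you could use Reduce instead of colSums. Note that Reduce with `+` adds the columns element-wise, so it returns the number of Inf values per row rather than per column; on this example data the two happen to coincide.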

DT[, Reduce(`+`, lapply(.SD, is.infinite))]
## [1] 2 0 1
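
To see what the colSums and Reduce variants each compute, here is a minimal base R sketch on a hypothetical data frame (`df`, `per_column` and `per_row` are names made up for illustration) where the row and column counts differ:

```r
# Hypothetical example data (base R only); row and column counts differ here.
df <- data.frame(a = c(Inf, 1, Inf, Inf), c = c(Inf, 5, NA, 2))

# One count per column: sapply() builds a logical matrix,
# colSums() then sums each column of that matrix.
per_column <- colSums(sapply(df, is.infinite))

# One count per row: Reduce(`+`, ...) adds the logical vectors element-wise.
per_row <- Reduce(`+`, lapply(df, is.infinite))

print(per_column)
print(per_row)
```

With four rows and two columns, `per_column` has two entries and `per_row` has four, which makes the difference visible.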

Another option is to create your own custom function and then just loop it over the columns

Myfunc <- function(x) sum(is.infinite(x))
DT[, lapply(.SD, Myfunc)]
#    a b c
# 1: 2 0 1

Of course you could also write a data.frame method for is.infinite, as it appears to be generic (see ?is.infinite).
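
A minimal sketch of such a method, assuming it is defined at the top level of your session. The method body is my own guess at a reasonable implementation, not something shipped with R or data.table, and `df`, `inf_mask` and `inf_counts` are illustrative names:

```r
# Hypothetical S3 method: apply is.infinite() to each column and
# bind the results into a logical matrix, one column per input column.
is.infinite.data.frame <- function(x) {
  do.call(cbind, lapply(x, is.infinite))
}

# Example data modelled on the question, with one -Inf added.
df <- data.frame(a = c(1/0, 1, -2/0), b = c("a", "b", "c"), c = c(1/0, 5, NA))

inf_mask <- is.infinite(df)      # logical matrix
inf_counts <- colSums(inf_mask)  # number of infinite values per column
print(inf_counts)
```

Because a data.table also inherits from data.frame, the same method should be picked up for DT as well, though I have only reasoned about the plain data.frame case here.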

David Arenburg
  • Also, `colSums(DT=='Inf', na.rm=TRUE)` would work, though not elegant – akrun May 20 '15 at 16:47
  • @akrun Yeah, I always forget this works too, though I'm not sure why. – David Arenburg May 20 '15 at 17:41
  • The quote is not needed though, I think it must work in the same way as DT==1 or other value – akrun May 20 '15 at 17:45
  • @akrun it still looks somewhat fishy to me. The documentation clearly says "*do not test equality to NaN*", though it doesn't mention anything regarding `Inf` – David Arenburg May 20 '15 at 17:57
  • @akrun, @Frank just pointed out that we need to cover `-Inf` too. Completely slipped my mind – David Arenburg May 20 '15 at 18:38
  • Thanks @DavidArenburg and @akrun! What is your preferred solution to replace Inf with NA? I found the method in the link I provided in the question not so elegant... – Tim_Utrecht May 21 '15 at 06:11
  • I think the one from the previous post is not that ugly though.. : `for (j in 1:ncol(DT)) set(DT, which(is.infinite(DT[[j]])), j, NA)` – Tim_Utrecht May 21 '15 at 06:19
  • The `set` method is very efficient but hard to read. If your data set is not too big, I would go with a simple `replace`. For example, create a custom function `Myfunc <- function(x) replace(x, is.infinite(x), NA)`, then simply loop it over the columns and update by reference: `DT[, names(DT) := lapply(.SD, Myfunc)]` – David Arenburg May 21 '15 at 06:42
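
Putting the comments above together, a minimal sketch of the `replace` approach, assuming the data.table package is installed. The helper name `inf_to_na` is hypothetical, and the example data adds a `-Inf` to show that `is.infinite` catches both signs, whereas a comparison with `Inf` would only match the positive one:

```r
library(data.table)

# Example data modelled on the question, with one -Inf added.
DT <- data.table(a = c(1/0, 1, -2/0), b = c("a", "b", "c"), c = c(1/0, 5, NA))

# Hypothetical helper: replace Inf and -Inf with NA, leave everything else as is.
inf_to_na <- function(x) replace(x, is.infinite(x), NA)

# Apply the helper to every column and update DT by reference.
DT[, names(DT) := lapply(.SD, inf_to_na)]

print(DT)
```

The character column `b` passes through unchanged because `is.infinite` returns FALSE for every character value.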