if function comes across 0, do nothing in R

Question

I have this code:

df[, -1] = apply(df[, -1], 2, function(x){x * log(x)})

df looks like:

sample a b  c
a2     2 1  2
a3     3 0 45

The problem I am having is that some of the values in df are 0, and you cannot take ln(0). So I would like to tell my program to spit out a 0 if it tries to take ln(0).
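For reference, a minimal sketch of what base R does in this situation (the vector `x` is made up for illustration): `log(0)` does not raise an error but returns `-Inf`, and multiplying that by 0 gives `NaN`, which is what ends up in the result.

```r
# log(0) is -Inf in R, not an error
log(0)
#> [1] -Inf

# 0 * -Inf is undefined, so the zero entry becomes NaN
x <- c(1, 0, 45)        # illustrative values, including one zero
res <- x * log(x)
is.nan(res)
#> [1] FALSE  TRUE FALSE
```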

One option is to add an amount less than floating point error to `x` so it won't noticeably change the results but will run fine, e.g.: `df[-1] <- lapply(df[-1], function(x){x * log(x + .Machine$double.xmin)})` — alistaire, Jan 08 '18 at 02:22
"So I would like tell my program to spit out a 0 if it tries to take ln(0)." So, you want to get a wrong result from an arithmetic operation? That sounds dangerous. — Roland, Jan 08 '18 at 06:59

score 2 · Answer 1 · answered Jan 08 '18 at 02:08

You could use ifelse here:

df[,-1] = apply(df[,-1], 2, function(x){ ifelse(x != 0, x*log(x), 0) })
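As a quick check, here is the same line run against the example data from the question (the data frame is rebuilt so the snippet is self-contained). Note that `ifelse()` still evaluates `x * log(x)` for every element, zeros included; it simply discards the resulting `NaN` and substitutes 0.

```r
# Example data from the question
df <- data.frame(sample = c("a2", "a3"),
                 a = c(2, 3),
                 b = c(1, 0),
                 c = c(2, 45))

# Replace x * log(x) with 0 wherever x is exactly 0
df[, -1] <- apply(df[, -1], 2, function(x) ifelse(x != 0, x * log(x), 0))

df
#>   sample        a b          c
#> 1     a2 1.386294 0   1.386294
#> 2     a3 3.295837 0 171.299812
```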

Tim Biegeleisen

score 0 · Answer 2 · answered Jan 08 '18 at 02:59

You can add a tiny positive amount to x before taking the log. log(.Machine$double.xmin) is finite (about -708.4) rather than -Inf, so an input of 0 gives 0 times a finite number, which is 0 instead of NaN. For any other value, the added amount is far smaller than floating point precision, so the result is unchanged for practical purposes.
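A small sketch of why the offset works, using only base R (the names `tiny` and `zero_case` are just for illustration). It assumes the inputs are non-negative and not themselves vanishingly close to zero:

```r
tiny <- .Machine$double.xmin    # smallest positive normalized double, about 2.2e-308

log(tiny)                       # finite, roughly -708.4, rather than -Inf

# 0 times a finite number is 0, so the zero case no longer yields NaN
zero_case <- 0 * log(0 + tiny)
zero_case == 0
#> [1] TRUE

# For ordinary values the offset is lost to rounding entirely
(2 + tiny) == 2
#> [1] TRUE
```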

Avoiding the iteration and using .Machine$double.xmin as the very, very small number:

df <- data.frame(sample = c("a2", "a3"), 
                 a = 2:3, 
                 b = c(1L, 0L), 
                 c = c(2L, 45L))

df
#>   sample a b  c
#> 1     a2 2 1  2
#> 2     a3 3 0 45

df[-1] <- df[-1] * log(df[-1] + .Machine$double.xmin)

df
#>   sample        a b          c
#> 1     a2 1.386294 0   1.386294
#> 2     a3 3.295837 0 171.299812

To check the results, let's use another approach, changing 0 values to 1 so that they return 0:

df2 <- data.frame(sample = c("a2", "a3"), 
                  a = 2:3, 
                  b = c(1L, 0L), 
                  c = c(2L, 45L))

df2[df2 == 0] <- 1
df2[-1] <- df2[-1] * log(df2[-1])

df2
#>   sample        a b          c
#> 1     a2 1.386294 0   1.386294
#> 2     a3 3.295837 0 171.299812

Because the change is less than floating point error, the results are identical according to R:

identical(df, df2)
#> [1] TRUE
