
I would like to change all 0's to, say, 0.0001 in a list of data frames, to avoid -Inf when taking the log. Following the instructions from "Replace all 0 values to NA", I wrote my function as

set_zero_as_value <- function(x, value=0.0001){
    x[x == 0] <- value
}

However, when I apply it to my data with sapply(a, set_zero_as_value), the result is

   s1    s2 
1e-04 1e-04 

And when I check the list a afterwards, the 0's in a have not changed at all. Is there a solution for this?

PS: the list a can be created as

> a = NULL
> a$s1 = rbind(cbind(0,1,2),cbind(3,4,5))
> a$s2 = rbind(cbind(0,1,2),cbind(3,4,5))
lolibility
  • Your function needs to return `x`. – mrip Oct 25 '13 at 14:47
  • And (hopefully I don't need to point this out) you need to assign the result to something, e.g. `a <- sapply(a,set_zero_as_value)` – Simon O'Hanlon Oct 25 '13 at 14:48
  • So I must assign the result? Can I simply modify a in place? If the data frame is very large, keeping both copies would be redundant. Or should I just rm the original afterwards? – lolibility Oct 25 '13 at 14:51
  • @lolibility use a `data.table` from the `data.table` package to *pass-by-reference*. R is ostensibly *pass-by-value* and will perform copy-on-modify on almost all objects you use. – Simon O'Hanlon Oct 25 '13 at 14:58
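
The two points raised in the comments above can be sketched as follows. This is only an illustration: the name `set_zero_no_return` is introduced here for the question's original function and does not appear in the thread, and the list `a` is rebuilt with `list()` rather than with `a = NULL`.

```r
# Original function: the last expression is an assignment, whose value is
# the right-hand side, so the function returns `value`, not the modified x.
set_zero_no_return <- function(x, value = 0.0001) {
  x[x == 0] <- value
}

# Same function with x returned as the last expression.
set_zero_as_value <- function(x, value = 0.0001) {
  x[x == 0] <- value
  x
}

# The list from the question.
a <- list(
  s1 = rbind(cbind(0, 1, 2), cbind(3, 4, 5)),
  s2 = rbind(cbind(0, 1, 2), cbind(3, 4, 5))
)

res_bad <- sapply(a, set_zero_no_return)  # named vector of 1e-04, as in the question
a_new <- lapply(a, set_zero_as_value)     # list of modified matrices

# `a` itself is untouched until the result is assigned back.
a <- a_new
```

Using `lapply` rather than `sapply` here keeps the result a list with one matrix per element, which is probably what is wanted before taking logs.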

1 Answer


Use pmax inside an lapply call. There is no need to define set_zero_as_value, since pmax does what you need as long as the data contain no negative values and no positive values below 0.0001. Suppose your list is:

list.DF <-list(structure(list(a = c(1L, 2L, 3L, 5L, 1L, 5L, 5L, 3L, 3L, 
0L), b = c(1L, 1L, 4L, 2L, 4L, 2L, 4L, 5L, 2L, 4L), c = c(5L, 
1L, 3L, 0L, 1L, 2L, 0L, 2L, 5L, 2L)), .Names = c("a", "b", "c"
), row.names = c(NA, -10L), class = "data.frame"), structure(list(
    d = c(2L, 3L, 2L, 1L, 4L, 4L, 4L, 0L, 4L, 2L), e = c(4L, 
    3L, 4L, 3L, 3L, 4L, 0L, 2L, 4L, 4L), f = c(2L, 5L, 2L, 1L, 
    0L, 0L, 1L, 3L, 3L, 2L)), .Names = c("d", "e", "f"), row.names = c(NA, 
-10L), class = "data.frame"))

Now applying your desired transformation:

> lapply(list.DF, function(x) sapply(x, pmax, 0.0001))

If you want to use your set_zero_as_value function instead, add return(x) at the end of it so that it returns the modified object:

set_zero_as_value <- function(x, value=0.0001){
  x[x == 0] <- value
  return(x)
}

lapply(list.DF, function(x) sapply(x, set_zero_as_value))

This will produce the same result as before.
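
One thing to be aware of: calling sapply on a data frame returns a matrix, so both calls above give a list of matrices. If the elements should stay data frames, a possible variant is to apply the replacement to each data frame as a whole. The sketch below assumes a small made-up list `dfs` and a helper named `replace_zeros`; neither comes from the question.

```r
# Hypothetical small list of data frames, for illustration only.
dfs <- list(
  data.frame(a = c(1, 0, 3), b = c(0, 2, 5)),
  data.frame(d = c(2, 0, 4), e = c(4, 3, 0))
)

replace_zeros <- function(df, value = 0.0001) {
  df[df == 0] <- value  # logical-matrix indexing works on data frames
  df                    # return the modified copy
}

dfs_fixed <- lapply(dfs, replace_zeros)

# log() on an all-numeric data frame returns a data frame; no -Inf remains.
logs <- lapply(dfs_fixed, log)
```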

Jilber Urbina
  • +1 but... your `pmax` solution will also change values in the data that are `x > 0 & x < 0.0001` which *may* be undesirable. – Simon O'Hanlon Oct 25 '13 at 15:01
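
The difference described in this comment can be seen on a small made-up vector (the values below are hypothetical and chosen only to show the effect):

```r
x <- c(0, 0.00005, 0.5, 2)  # hypothetical values, one of them between 0 and 0.0001

# pmax raises every value below the threshold, including 0.00005.
via_pmax <- pmax(x, 0.0001)

# Replacing on x == 0 changes only exact zeros; 0.00005 is kept as is.
via_replace <- replace(x, x == 0, 0.0001)
```

For the integer data in the answer both approaches give the same values, since no entries fall strictly between 0 and 0.0001.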