4

I am running a simulation in which I need to count how many times a particular condition occurs within a function call. I tried to do this by assigning to a global object. That works if you call the function once, but if you lapply the function over a list, as I'm doing, you get a single running total across all calls rather than a separate count for each element of the list passed to lapply.

Here's a dummy situation where the occurrence is evenness of a number:

FUN <- function(x){
    lapply(seq_along(x), function(i) {
        y <- x[i]
        if (y %% 2 == 0){
            # increments the counter that lives in the global environment
            assign("count.occurrences", count.occurrences + 1, envir=.GlobalEnv)
        }
        print("do something")
    })
    list(guy="x", count=count.occurrences)
}

#works as expected
count.occurrences <- 0
FUN(1:10)


count.occurrences <- 0  
lapply(list(1:10, 1:3, 11:16, 9), FUN) 

#gives me...
#> count.occurrences
#[1] 9

#I want...
#> count.occurrences
#[1] 5  1  3  0

It's in a simulation so speed is an issue. I want this to be as fast as possible so I'm not married to the global assignment idea.

IRTFM
Tyler Rinker
  • You want us to improve on your one-liner solution? – Andrie Aug 07 '12 at 15:41
  • @Andrie, thanks for the comment (+1). When I started typing the question I couldn't think of how to do it at all, because I hadn't noticed that the function returns `count.occurrences` in the list; the solution occurred to me only after I had typed the question. I figured the global assignment was costly anyway and that there was a better way to do it. – Tyler Rinker Aug 07 '12 at 16:08
  • If this is the best I can do I'll remove the edit to my question and add it as an answer. – Tyler Rinker Aug 07 '12 at 16:09
  • You could probably also do this using closures, but my view is that if you have a simple, working solution then just stick with it. – Andrie Aug 07 '12 at 16:21
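The closure idea mentioned in the comment above could look roughly like the following. This is only a sketch: `make_counter`, `bump` and `get` are hypothetical names that do not appear in the thread, and it has not been benchmarked against the other approaches.

```r
# Hypothetical helper (not from the thread): builds a counter whose state
# lives in the environment enclosing the two returned functions.
make_counter <- function() {
    n <- 0
    list(
        bump = function() { n <<- n + 1; invisible(n) },
        get  = function() n
    )
}

FUN <- function(x) {
    counter <- make_counter()   # fresh counter per call, so counts never mix
    for (y in x) {
        if (y %% 2 == 0) counter$bump()
    }
    list(guy = "x", count = counter$get())
}

Z <- lapply(list(1:10, 1:3, 11:16, 9), FUN)
counts <- sapply(Z, `[[`, "count")
```

Because every call to `FUN` builds its own counter, nothing has to be reset between calls and nothing is written to the global environment.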

3 Answers

8

Rather than assigning to the global environment, why not assign to a variable inside FUN's own environment?

FUN <- function(x){
    count.occurances <- 0    # local to this call of FUN
    lapply(seq_along(x), function(i) {
        y <- x[i]
        if (y %% 2 == 0){
            # `<<-` updates the variable in the enclosing function (FUN)
            count.occurances <<- count.occurances + 1
        }
        print("do something")
    })
    list(guy="x", count=count.occurances)
}

Z <- lapply(list(1:10, 1:3, 11:16, 9), FUN) 

Then you can just pull the counts out.

> sapply(Z, `[[`, "count")
[1] 5 1 3 0
Brian Diggs
  • I think this is the route to go though benchmarking any of them inside of this simulation is outside of the question. – Tyler Rinker Aug 07 '12 at 19:52
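For readers unfamiliar with `<<-`, here is a minimal sketch of why the answer above keeps the count local. `FUN_demo` and `hits` are made-up names for illustration. The operator searches the enclosing environments for an existing variable of that name, so it only stays out of the global environment when the variable is first created inside the outer function.

```r
FUN_demo <- function(x) {
    hits <- 0                          # defined here, so `<<-` below finds it
    invisible(lapply(x, function(y) {
        if (y %% 2 == 0) hits <<- hits + 1
    }))
    hits
}

result <- FUN_demo(1:10)

# Check that nothing was created in the global environment by the call.
leaked <- exists("hits", envir = globalenv(), inherits = FALSE)
```

If the `hits <- 0` line were dropped, `<<-` would keep walking up the chain of environments and end up reading from and assigning to the global environment, which is the situation the question started with.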
2

I haven't done any benchmarking on this, but have you tried just using a for loop? I know that loops aren't generally encouraged in R, but they're also not always slower.

FUN <- function(x) {
  count.occurrences <- 0
  for (i in seq_along(x)) {
    y <- x[i]
    if (y %% 2 == 0) {
      # a for loop has no scope of its own, so plain `<-` is enough
      count.occurrences <- count.occurrences + 1
    }
    print("do something")
  }
  list(guy="x", count=count.occurrences)
}

lapply(list(1:10, 1:3, 11:16, 9), FUN)
A5C1D2H2I1M1N2O1R2T1
  • Oh, and I fixed the spelling of occurrences in this function ;-) – A5C1D2H2I1M1N2O1R2T1 Aug 07 '12 at 16:26
  • Good point mrdwab, I showed this very thing you're talking about here: http://stackoverflow.com/questions/10467322/loops-in-r-need-to-use-index-anyway-to-avoid-for/10467375#10467375 – Tyler Rinker Aug 07 '12 at 16:26
  • I was too lazy to fix the spelling after I posted. In R it doesn't catch the spelling but in the title of the post it did (I use chrome) so I fixed it there. :) – Tyler Rinker Aug 07 '12 at 16:27
  • mrdwab, this is already a repeat loop, and heavily nested at that (I'm working with someone else's code and that's how they work). Would the method you propose work if the loop were nested at least three levels deep inside another loop? (I rarely use loops, so I don't know the answer to this question.) – Tyler Rinker Aug 07 '12 at 16:29
  • @TylerRinker, sorry, but I'm no expert on loops and efficiency, particularly with nested loops, so I don't have the answer to this question either. – A5C1D2H2I1M1N2O1R2T1 Aug 07 '12 at 17:33
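On the nesting question raised in the comments above: `for`, `while` and `repeat` loops in R do not create a new scope, so a counter defined at the top of a function can be updated with plain `<-` at any loop depth. A small sketch, with `count_even_nested` and the test array invented for illustration (it says nothing about speed):

```r
# Hypothetical example: count even values across three levels of nesting.
count_even_nested <- function(arr) {
    count <- 0
    d <- dim(arr)
    for (i in seq_len(d[1])) {
        for (j in seq_len(d[2])) {
            for (k in seq_len(d[3])) {
                if (arr[i, j, k] %% 2 == 0) {
                    count <- count + 1   # same function frame at every depth
                }
            }
        }
    }
    count
}

arr <- array(1:24, dim = c(2, 3, 4))
n_even <- count_even_nested(arr)   # 12 of the values 1..24 are even
```

`<<-` or `assign()` only becomes necessary when the increment happens inside another function, such as the anonymous function passed to `lapply`.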
0

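Using the FUN from the question (the version that assigns to the global counter), I can get it like this:

count.occurrences <- 0
Z <- lapply(list(1:10, 1:3, 11:16, 9), FUN)
# each call returns the running total, so differencing recovers the per-call counts
diff(c(0, sapply(Z, `[[`, "count")))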

I'm open to better ideas (faster).
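One possibly faster variation, assuming the condition can be evaluated on the whole vector at once (true for the evenness check in the dummy example, but not necessarily for the real simulation, where the "do something" step may have to run element by element), is to skip the counter entirely. The name `inputs` is introduced here only for illustration.

```r
inputs <- list(1:10, 1:3, 11:16, 9)

# sum() over a logical vector counts the TRUEs; vapply() checks that every
# call returns exactly one integer.
counts <- vapply(inputs, function(x) sum(x %% 2 == 0), integer(1))
```

Whether this beats the loop versions inside the actual simulation would need to be timed there, for example with `system.time()`.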

Tyler Rinker