
I can't solve one problem. I want to write a function like this:

    f.describe <- function(.data, .var, .by)
    {
        require(plyr)

        res <- ddply(.data, .by,
                     summarize,
                     N = sum(!is.na(.var)),
                     `Mean (SD)` = sprintf("%5.2f (%5.2f)",
                                           mean(.var, na.rm=TRUE), sd=sd(.var, na.rm=TRUE)),
                     Median = sprintf("%5.2f", median(.var))
        )
        res
    }

But I can't find a way to pass the variable to be processed (.var). I get this error:

    Error in eval(expr, envir, enclos) : object '.var' not found

(The message is translated from another language, so it may not match the English wording verbatim.)

I tried eval and substitute but found no solution. Many thanks for any help. Sometimes I do not understand the rules R uses for evaluation.

crow16384
  • plyr does have known issues with scoping of variables inside functions. – baptiste May 01 '13 at 14:22
  • You mean the `.` (dot) function? I know about it, but even after trying to work around it I still have no solution... – crow16384 May 01 '13 at 14:30
  • unfortunately I cannot find the discussion I had when I got stuck on a similar problem. My ugly workaround was to use `<<-` to push a local variable into the global environment where plyr could find it... – baptiste May 01 '13 at 14:38
  • @baptiste: Thanks for the advice. I do not want to use global variables. It is possible to find ugly solutions with parse, but I believe there is a beautiful one that stays within the language rules :) – crow16384 May 01 '13 at 14:43
  • [related question and discussion](http://stackoverflow.com/questions/6955128/object-not-found-error-with-ddply-inside-a-function) (including a comment from @hadley, suggesting that Jan van der Laan's solution below is the preferred workaround) – baptiste May 01 '13 at 15:26
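The scoping problem discussed in these comments can be reproduced without plyr. The sketch below assumes the error arises because the summary expression is evaluated against the data in an environment that cannot see the calling function's local variables. `eval_in_data` and `outer_fn` are hypothetical names used only for illustration, and `mtcars` is R's built-in data set.

```r
# Hypothetical stand-in for a helper that evaluates an unevaluated
# expression against a data frame. The enclosing environment is baseenv(),
# so variables local to the caller are deliberately NOT visible.
eval_in_data <- function(data, expr) {
  eval(substitute(expr), data, baseenv())
}

# Columns of the data frame are found without trouble.
works <- eval_in_data(mtcars, mean(mpg))

# A variable that lives only inside the calling function is not found.
outer_fn <- function(data) {
  local_col <- data$mpg
  eval_in_data(data, mean(local_col))
}

msg <- tryCatch(outer_fn(mtcars), error = function(e) conditionMessage(e))
print(msg)  # an "object ... not found" error message
```

Under this assumption, the expression can see the columns of each data piece but not `.var`, which exists only in the frame of `f.describe`.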

1 Answer


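In this case it might be easier to pass a function to ddply instead of using summarize. Here `.var` is the column name given as a character string, and `d[[.var]]` extracts that column from each piece of the data: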

    f.describe <- function(.data, .var, .by) {
        require(plyr)

        ddply(.data, .by, function(d) {
           c(N = sum(!is.na(d[[.var]])),
           `Mean (SD)`=sprintf("%5.2f (%5.2f)", 
               mean(d[[.var]], na.rm=TRUE), 
               sd=sd(d[[.var]], na.rm=TRUE)),
           Median = sprintf("%5.2f", median(d[[.var]])))    
        })
    }
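For comparison, the same pattern (column name passed as a string, column extracted with `[[`) can be sketched in base R without plyr. This is only an illustration under stated assumptions, not the plyr approach above: `describe_by`, its output column names, and the use of the built-in `mtcars` data set are all hypothetical choices.

```r
# Hypothetical base-R helper: summarise column `var` of `data` within the
# groups defined by column `by`. Both `var` and `by` are character strings.
describe_by <- function(data, var, by) {
  pieces <- split(data, data[[by]])        # one data frame per group
  rows <- lapply(names(pieces), function(g) {
    x <- pieces[[g]][[var]]                # `[[` looks the column up by name
    data.frame(
      group   = g,
      N       = sum(!is.na(x)),
      mean_sd = sprintf("%5.2f (%5.2f)",
                        mean(x, na.rm = TRUE), sd(x, na.rm = TRUE)),
      # na.rm = TRUE is an assumption here, for consistency with mean and sd
      median  = sprintf("%5.2f", median(x, na.rm = TRUE)),
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)                     # stack the one-row summaries
}

res <- describe_by(mtcars, "mpg", "cyl")
print(res)
```

Because the column is looked up by name inside an ordinary function, no non-standard evaluation is involved, so the scoping problem from the question does not come up.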
Jan van der Laan