The following code runs fine:

library(dplyr)
library(lazyeval)
datatable <- data.frame(f= c("Group1","Group2")
           ,a = c(100,200)
           ,b = c(400,500)
           ,c = c(50000,35000)
           ,d = c(99000,70000))

datatable %>%
      group_by(f) %>%
      mutate(p = prop.test(x=c(a, b)
                           ,n=c(c, d)
                           ,alternative = c("two.sided")
                           ,correct = FALSE)$p.value)

However, when put into a function, the code errors:

functionx <- function(datatable, f, a, b, c, d)
  {
    Table <- datatable %>%
              group_by_(f) %>%
              mutate_(p = interp(~prop.test(x=c(a, b)
                                            ,n=c(c, d)
                                            ,alternative = c("two.sided")
                                            ,correct = FALSE)$p.value))
  }

The error I am receiving reads as follows:

Error: non-numeric argument to binary operator

I've tried writing the function a few different ways (e.g. `a = as.name(a)`). I am new to writing functions (specifically NSE/SE), so any help is appreciated.

Mark Wagner
  • What's `interp()` doing for you? – Nate Sep 21 '16 at 02:48
  • It's from library(lazyeval). dplyr uses non-standard evaluation (NSE), and I believe functions need standard evaluation (SE). As I understand it, interp converts from NSE to SE; see http://stackoverflow.com/questions/26724124/standard-evaluation-in-dplyr-summarise-on-variable-given-as-a-character-string – Mark Wagner Sep 21 '16 at 02:52
  • hmm i've never come across that before, can you post an example dataframe? – Nate Sep 21 '16 at 02:54
  • Please make your example [reproducible](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) by supplying sample input so we can see what you are passing to the function and recreate the error. Also, explicitly list all packages you import. – MrFlick Sep 21 '16 at 03:34
  • You write "dplyr uses NSE", but in your function definition, you are actually using the SE version of group_by and mutate. For instance, if `f` is a name of a column in your `datatable`, then `group_by_("f")` should work for grouping. – jakub Sep 21 '16 at 08:49
  • I have updated the thread with a sample dataframe. @jakub dplyr uses NSE, so I have to convert to SE version of code. – Mark Wagner Sep 21 '16 at 13:20
  • Just to avoid confusion, there are two versions of most functions in `dplyr`: NSE and SE. It is easy to tell them apart: NSE versions do not have the trailing underscore. For example, `group_by` uses NSE while `group_by_` uses SE. To say that "dplyr" uses NSE is not precise: It uses both NSE and SE, depending on which version of the function you choose. – jakub Sep 22 '16 at 00:06
  • Thank you, good call on defining dplyr better than myself. – Mark Wagner Sep 22 '16 at 19:05

1 Answer

I was able to find a solution to my question. I recommend reading through the dplyr NSE vignette: https://cran.r-project.org/web/packages/dplyr/vignettes/nse.html

library(dplyr)
library(lazyeval)

functionx <- function(datatable, facet.var, treated.take, holdout.take, treated.pop, holdout.pop)
{
  datatable %>%
    group_by_(facet.var) %>%
    # interp() replaces the placeholders a, b, d, e in the formula with the
    # column names passed in as strings (turned into symbols by as.name())
    mutate_(p = interp(~prop.test(x=c(a, b)
                                  ,n=c(d, e)
                                  ,alternative = c("two.sided")
                                  ,correct = FALSE)$p.value
                       ,a = as.name(treated.take)
                       ,b = as.name(holdout.take)
                       ,d = as.name(treated.pop)
                       ,e = as.name(holdout.pop)))
}

datatablex <- data.frame(facet.varx= c("Group1","Group2")
           ,treated.takex = c(100,200)
           ,holdout.takex = c(400,500)
           ,treated.popx = c(50000,35000)
           ,holdout.popx = c(99000,70000))

functionx(datatablex, "facet.varx", "treated.takex", "holdout.takex", "treated.popx", "holdout.popx")

There were 2 issues:

  1. The function failed when one of its arguments was named "c". I am not sure why; possibly because the name clashes with the `c()` function used to build the vectors.

  2. Adding "a = as.name(treated.take)" etc. seemed to correct most of the issues. The placeholder does not have to be called a: the formula could use treated.take directly, with the substitution written as treated.take = as.name(treated.take).
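
To illustrate the second point, here is a minimal sketch of what `interp()` does with and without `as.name()`. The data frame `demo` and the variable `col` are made up for this illustration, and `sum()` stands in for the longer `prop.test()` call; this is not the code from the function above.

```r
library(lazyeval)

# Hypothetical data and column name, for illustration only
demo <- data.frame(treated.takex = c(100, 200))
col  <- "treated.takex"

# Without as.name(), the placeholder v is replaced by the string itself,
# so the expression would operate on text rather than on the column
f_string <- interp(~ sum(v), v = col)

# With as.name(), the placeholder v is replaced by a symbol,
# which is later looked up as a column of the data
f_symbol <- interp(~ sum(v), v = as.name(col))

# The right-hand side of each formula shows the difference
f_string[[2]]   # a call on the string "treated.takex"
f_symbol[[2]]   # a call on the symbol treated.takex

# Evaluating the symbol version against the data finds the column
total <- eval(f_symbol[[2]], demo)
```

Evaluating the string version instead would call `sum()` on a character value and fail, which is the same kind of problem as passing column names as plain strings into `prop.test()`.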

Mark Wagner