I am relying on the compareGroups package to do some comparisons after a pipe-chain. When subsetting the final results, the call to [ triggers a call to update (both in their bespoke compareGroups versions), which leads to a scoping problem.
Try this:
library(tidyverse)
# install.packages("compareGroups")
library(compareGroups)
get_data <- function() return(mtcars)
assign_group <- function(df) {
  n <- nrow(df)
  df$group <- rbinom(n, 1, 0.5)
  return(df)
}
get_results <- function() {
  get_data() %>% assign_group %>% compareGroups(group ~ ., data = .)
}
res <- get_results()
# all the above works, but the following triggers the error:
res["mpg"]
This leads to the following error:
Error in compareGroups(formula = group ~ mpg, data = .) : object '.' not found
The relevant (abbreviated) traceback is this:
compareGroups(formula = group ~ mpg, data = .)
eval(call, parent.frame())
update.compareGroups(x, formula = group ~ mpg)
update(x, formula = group ~ mpg) at <text>#1
eval(parse(text = cmd))
`[.compareGroups`(res, "mpg")
res["mpg"]
So, my understanding is that the dot notation in the dplyr (magrittr) pipe-chain prevents the update call from finding the dataframe, which is stored as . in the call. The error therefore makes sense: . is neither the name of the dataframe nor available outside the scope of the function get_results (though the main issue is the . itself). One obvious way of avoiding this error is to fix the update.compareGroups function: I don't think we need another call to the package to redo all calculations when I simply want to retrieve individual results (which have already been calculated).
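To make the mechanism concrete, here is a minimal base-R sketch of what I believe is happening. It does not use compareGroups or the pipe: fit and refit are hypothetical stand-ins for a function that records its own call and an update-style method that re-evaluates that call, and the local variable . only mimics the pipe placeholder.

```r
# Toy stand-in for a function that records its own call (like a model fit)
fit <- function(formula, data) {
  structure(list(call = match.call(), n = nrow(data)), class = "toyfit")
}

# Toy stand-in for an update method: re-evaluate the stored call
# in the frame of whoever called refit()
refit <- function(object) eval(object$call, parent.frame())

make_fit <- function() {
  . <- mtcars                # mimics the pipe placeholder; local to make_fit
  fit(mpg ~ cyl, data = .)   # the symbol `.` is what ends up in the call
}

obj <- make_fit()
print(obj$call)              # fit(formula = mpg ~ cyl, data = .)

# Re-evaluating the call outside make_fit() cannot resolve `.`
res <- tryCatch(refit(obj), error = function(e) conditionMessage(e))
print(res)
```

The stored call keeps the symbol ., not the data it pointed to, so any later re-evaluation depends on . being visible from wherever the call is evaluated.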
However, this is a more general issue with the . notation of dplyr and the fact that it is stored in the call. The problem seems general enough that I would imagine someone has encountered it before and has found a more general solution?