I'm comparing the convenience of dplyr vs. data.table when working within loops and functions.
For this, I'm trying to modify the code snippets used in this post: data.table vs dplyr: can one do something well the other can't or does poorly? Instead of hard-coding dataset variable names (the "cut" and "price" variables of the "diamonds" dataset), I want the code to be dataset-agnostic, i.e. copy-and-paste ready for use inside any function or loop (where we don't know the column names in advance and need to access columns by number).
This is the original code:
tbl = diamonds
tbl %>%
  filter(cut != "Fair") %>%
  group_by(cut) %>%
  summarize(
    AvgPrice = mean(price)
  )
I need to rewrite it so that I can use the same code in a loop like this one:
for(nVarGroup in 2:4)       # Grouped by possible categorical values...
  for(nVarMeans in 5:10) {  # ... get means of all parameters
  }
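To make the goal concrete, here is a minimal sketch of what that loop iterates over, assuming tbl is the diamonds dataset from ggplot2. It only resolves each column number to a column name; the `combos` vector is a hypothetical helper used for illustration.

```r
library(ggplot2)  # assumption: diamonds comes from ggplot2

tbl <- diamonds
combos <- character(0)  # hypothetical collector, for illustration only

for (nVarGroup in 2:4) {       # candidate grouping columns
  for (nVarMeans in 5:10) {    # columns whose means are wanted
    # Column names are only known at run time, via their positions
    strVarGroup <- names(tbl)[nVarGroup]
    strVarMeans <- names(tbl)[nVarMeans]
    combos <- c(combos, paste(strVarMeans, "by", strVarGroup))
  }
}

head(combos, 3)
```

The grouped summary itself would go where `combos` is filled in; that is the part I can't express in dplyr.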
I've done it for data.table, as shown here: How to use data.table within functions and loops?.
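For context, the data.table version looks roughly like the sketch below. It is a minimal reconstruction that reuses the variable names from this question and is not necessarily identical to the code in the linked post; the `res` name and the use of `get()` for the averaged column are assumptions.

```r
library(data.table)
library(ggplot2)  # assumption: diamonds comes from ggplot2

dt <- as.data.table(diamonds)

nVarGroup <- 2  # "cut"
nVarMeans <- 7  # "price"
strVarGroup <- names(dt)[nVarGroup]
strVarMeans <- names(dt)[nVarMeans]
strGroupConditions <- levels(dt[[nVarGroup]])[-1]  # all levels except the first ("Fair")

# get() looks the column up by its name at run time;
# by= accepts a character vector of column names
res <- dt[get(strVarGroup) %in% strGroupConditions,
          .(AvgPrice = mean(get(strVarMeans))),
          by = strVarGroup]
res
```

Nothing here refers to "cut" or "price" literally, which is the property I want to keep in the dplyr version.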
I'm struggling, however, to do the same for dplyr.
These links were recommended to resolve the problem: dplyr: How to use group_by inside a function?, https://cran.r-project.org/web/packages/dplyr/vignettes/nse.html.
However, while they provide a solution for the group_by(strVarGroup) line below, they do not seem to provide a solution for the qGroup=quote(get(strVarGroup) %in% strGroupConditions) line.
nVarGroup = 2 #"cut"
nVarMeans = 7 #"price"
strVarGroup = names(dt)[nVarGroup]
strVarMeans = names(dt)[nVarMeans]
qAction=quote(mean(strVarMeans))
strGroupConditions = levels(dt[[nVarGroup]])[-1] # "Good" "Very Good" "Premium" "Ideal"
qGroup=quote(get(strVarGroup) %in% strGroupConditions)
### DOES NOT WORK ###
tbl %>%
  filter(eval(qGroup)) %>%
  group_by(strVarGroup) %>%
  summarize(
    AvgPrice = eval(qAction),
  )
### END: DOES NOT WORK ###
Any additional links or ideas to help?