I'm confused about how to pass function arguments into dplyr and ggplot2 code. I'm using the newest versions of dplyr and ggplot2. Here is my code to produce a bar plot (clarity vs. mean price):

diamond.plot<- function (data, group, metric) {
    group<- quo(group)
    metric<- quo(metric)
    data() %>% group_by(!! group) %>%
           summarise(price=mean(!! metric)) %>% 
           ggplot(aes(x=!! group,y=price))+
           geom_bar(stat='identity') 
}

diamond.plot(diamonds, group='clarity', metric='price')

error:

Error in UseMethod("group_by_") : no applicable method for 'group_by_' applied to an object of class "packageIQR"

In the newest version of dplyr, the underscored verbs (such as group_by_()) are soft-deprecated. It seems we should use quosures instead.

my questions:

  • Can someone clarify the current best practice for this?
  • What was wrong with the above code? (No underscored dplyr verbs, please.)
  • In ggplot2, I know we can use aes_string(), but in my case only one of the parameters in aes() is passed from a function argument.

Thanks in advance.

fmic_
zesla

5 Answers

Tidy evaluation is fully supported as of ggplot2 v3.0.0, so it's no longer necessary to use aes_() or aes_string().

library(rlang)
library(tidyverse)

diamond_plot <- function (data, group, metric) {
    # group and metric are column names passed as strings
    quo_group  <- sym(group)
    quo_metric <- sym(metric)

    data %>%
        group_by(!! quo_group) %>%
        summarise(price = mean(!! quo_metric)) %>%
        # the summarised column is always named "price", so map y to it
        ggplot(aes(x = !! quo_group, y = price)) +
        geom_col()
}

diamond_plot(diamonds, "clarity", "price")

Created on 2018-04-16 by the reprex package (v0.2.0).

Tung

I don't think you can do that the "correct" way quite yet, since ggplot2 doesn't support the tidyeval syntax, but support is coming.

The best practice with the dplyr part of the code would be:

library(tidyverse)
library(rlang)

diamond_data <- function (data, group, metric) {
   quo_group <- enquo(group)
   quo_metric <- enquo(metric)
   data %>%
     group_by(!!quo_group) %>%
     summarise(price=mean(!!quo_metric))
}
diamond_data(diamonds, clarity, price)
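As a rough illustration of why `enquo()` is used here instead of the `quo()` from the question, the sketch below compares what each one captures inside a function. The helper names `capture_with_quo`, `capture_with_enquo`, and `broken` are made up for this example. The last part reproduces the class named in the question's error message, which appears to come from calling `data()` as a function rather than using `data` as an object.

```r
library(rlang)

# Hypothetical helpers, for illustration only
capture_with_quo   <- function(x) quo(x)    # captures the literal symbol `x`
capture_with_enquo <- function(x) enquo(x)  # captures what the caller supplied

expr_quo   <- quo_get_expr(capture_with_quo(clarity))
expr_enquo <- quo_get_expr(capture_with_enquo(clarity))

print(expr_quo)    # the symbol x
print(expr_enquo)  # the symbol clarity

# `data()` looks up a *function* named data, skips the data frame argument,
# and ends up calling utils::data(), which lists available datasets.
broken <- function(data) class(data())
print(broken(mtcars))  # "packageIQR"
```

So with `quo(group)`, dplyr is asked to group by a column literally called `group`, whatever the caller passed in.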

To work around the lack of tidyeval support in ggplot2, you could do the following (note the quotes around the variable names in the function call):

diamond_plot <- function (data, group, metric) {
    quo_group <- parse_quosure(group)
    quo_metric <- parse_quosure(metric)
    data %>%
        group_by(!!quo_group) %>%
        summarise(price=mean(!!quo_metric)) %>%
        ggplot(aes_(x = as.name(group), y=as.name(metric)))+
        geom_bar(stat='identity')
}
diamond_plot(diamonds, "clarity", "price")

EDIT -- Following @lionel's comment:

diamond_plot <- function (data, group, metric) {
    quo_group <- sym(group)
    quo_metric <- sym(metric)
    data %>%
        group_by(!!quo_group) %>%
        summarise(price=mean(!!quo_metric)) %>%
        ggplot(aes_(x = quo_group, y= quo_metric)) +
        geom_bar(stat='identity')
}
diamond_plot(diamonds, "clarity", "price")
fmic_
  • I would just use `sym()` (or `as.name()`) instead of `parse_quosure()` in that case. – Lionel Henry Aug 02 '17 at 07:25
  • thanks for the suggestion @lionel, I updated my answer – fmic_ Aug 02 '17 at 13:44
  • @lionel could you explain a little bit on why you prefer sym() over parse_quosure? – zesla Aug 02 '17 at 16:41
  • Because the function is expecting column names, as opposed to symbols that might refer to contextual objects. Parsing will also create expressions, not just symbols, which is not appropriate here. It's often bad style to parse arbitrary code. Finally the symbols can be passed to both the dplyr functions and the `aes_()` function while `aes_()` doesn't support quosures. Using symbols instead of quosured symbols of course assumes that the names refer to columns and not to objects from the context, which seems reasonable here. – Lionel Henry Aug 02 '17 at 16:50
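The distinction drawn in the comment above can be checked with a small sketch. It uses `rlang::parse_expr()` as a stand-in for the parsing step (an assumption made for illustration; the answer itself used `parse_quosure()`), and the variable names are invented for this example.

```r
library(rlang)

# sym() always yields a single symbol, whatever the string contains
col_sym   <- sym("clarity")
odd_sym   <- sym("mean(price)")         # still one symbol, with an unusual name

# Parsing evaluates the string as R code, so it can yield a call
parsed_name <- parse_expr("clarity")      # a symbol
parsed_call <- parse_expr("mean(price)")  # a function call, not a column name

print(is.symbol(col_sym))
print(is.symbol(odd_sym))
print(is.symbol(parsed_name))
print(is.call(parsed_call))
```

If the function is only meant to accept column names, `sym()` keeps the input from turning into arbitrary code.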

To me, the most "tidyeval" way to solve this problem is to combine the quo_name() and aes_string() functions. Avoid trailing-underscore verbs such as aes_(), since they are being deprecated.

diamond_plot <- function(data, group, metric) {
  quo_group <- enquo(group)
  str_group <- quo_name(quo_group)

  quo_metric <- enquo(metric)

  summary <- data %>%
     group_by(!!quo_group) %>%
     summarise(mean = mean(!!quo_metric))

  ggplot(summary) +
  geom_bar(aes_string(x = str_group, y = "mean"), stat = "identity")
}

diamond_plot(diamonds, clarity, price)
Stormwalker

sinQueso's answer is promising, but it misses the purpose of a function, which is to be adaptable to different data frames. The "price" variable is hard-coded in the function in the following line:

summarise(price=mean(!!quo_metric)) %>%

so this function will only work if the input variable is "price".

Here is a better solution that can be used for any data frame:

diamond_plot <- function (data, group, metric) {
    quo_group <- sym(group)
    quo_metric <- sym(metric)
    summary <- data %>%
        group_by(!!quo_group) %>%
        summarise(mean = mean(!!quo_metric))
    ggplot(summary, aes_string(x = group, y = "mean")) +
        geom_bar(stat = 'identity')
}
diamond_plot(diamonds, "clarity", "price")
Daniel Yudkin

You can go even further than Daniel's solution so that the name of the summary variable (metric) changes with the input.

diamond_plot <- function(data, group, metric) {
    quo_group <- rlang::sym(group)
    quo_metric <- rlang::sym(metric)
    metric_name <- rlang::sym(stringr::str_c("mean_", metric))
    data %>%
        group_by(!!quo_group) %>%
        summarize(!!metric_name := mean(!!quo_metric)) %>%
        ggplot(aes_(x = quo_group, y = metric_name)) +
        geom_bar(stat = 'identity')
}
diamond_plot(diamonds, "clarity", "price")
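To see the dynamic naming on its own, here is a minimal sketch of just the dplyr step, run on the built-in `mtcars` data set instead of `diamonds`. The function name `summarise_mean` is hypothetical, and the new column name is built with base `paste0()` rather than `stringr::str_c()`.

```r
library(dplyr)

# Hypothetical helper: the summary column is named after the input metric
summarise_mean <- function(data, group, metric) {
    group_sym   <- rlang::sym(group)
    metric_sym  <- rlang::sym(metric)
    metric_name <- paste0("mean_", metric)  # e.g. "mean_mpg"
    data %>%
        group_by(!!group_sym) %>%
        summarise(!!metric_name := mean(!!metric_sym))
}

result <- summarise_mean(mtcars, "cyl", "mpg")
print(names(result))  # "cyl" "mean_mpg"
```

The `:=` operator is what allows the left-hand side of `summarise()` to be unquoted with `!!`; a plain `=` would not accept it.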