14

Consider this simple example:

library(dplyr)
library(ggplot2)

# tibble() replaces the deprecated data_frame()
dataframe <- tibble(id = c(1,2,3,4),
                    group = c('a','b','c','c'),
                    value = c(200,400,120,300))

# A tibble: 4 x 3
     id group value
  <dbl> <chr> <dbl>
1     1     a   200
2     2     b   400
3     3     c   120
4     4     c   300

Here I want to write a function that takes the dataframe and the grouping variable as input. Ideally, after grouping and aggregating, I would like to print a ggplot chart.

This works:

get_charts2 <- function(data, mygroup){

  quo_var <- enquo(mygroup)

  df_agg <- data %>% 
    group_by(!!quo_var) %>% 
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>% 
    ungroup()

  df_agg
}



> get_charts2(dataframe, group)
# A tibble: 3 x 3
  group  mean count
  <chr> <dbl> <int>
1     a   200     1
2     b   400     1
3     c   210     2

Unfortunately, adding a ggplot call to the function above fails:

get_charts1 <- function(data, mygroup){

  quo_var <- enquo(mygroup)

  df_agg <- data %>% 
    group_by(!!quo_var) %>% 
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>% 
    ungroup()

  ggplot(df_agg, aes(x = count, y = mean, color = !!quo_var, group = !!quo_var)) + 
    geom_point() +
    geom_line() 
}


> get_charts1(dataframe, group)
Error in !quo_var : invalid argument type

I don't understand what is wrong here. Any ideas? Thanks!

EDIT: there is an interesting follow-up here: How to create factor variables from quosures in functions using ggplot and dplyr?

Tung
ℕʘʘḆḽḘ

2 Answers

12

At the time of writing (August 2017), ggplot2 does not yet support tidy eval syntax (you can't use !! inside aes()). You need to use more traditional standard-evaluation calls instead; aes_q() in ggplot2 can help with this.

get_charts1 <- function(data, mygroup){

  quo_var <- enquo(mygroup)

  df_agg <- data %>% 
    group_by(!!quo_var) %>% 
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>% 
    ungroup()

  ggplot(df_agg, aes_q(x = quote(count), y = quote(mean), color = quo_var, group = quo_var)) + 
    geom_point() +
    geom_line() 
}


get_charts1(dataframe, group)
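
To see why count is wrapped in quote() while quo_var is passed as is, it can help to look at what each one holds. The following is a minimal sketch, assuming the rlang package is installed; show_exprs is a hypothetical helper used only for illustration and is not part of the answer's solution.

```r
library(rlang)

# Hypothetical helper, for illustration only: returns the two kinds of
# expression that the aes_q() call above receives.
show_exprs <- function(mygroup) {
  quo_var <- enquo(mygroup)            # captures the caller's expression (here: group)
  list(
    literal  = quote(count),           # the symbol `count`, written out directly
    captured = quo_get_expr(quo_var)   # the symbol stored inside quo_var
  )
}

# `group` is never evaluated here, so it does not need to exist as a variable
res <- show_exprs(group)
str(res)
```

Both elements end up as plain symbols: one was typed literally and therefore had to be quoted, while the other was already captured by enquo() and only needs to be handed over.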
MrFlick
  • thanks! really neat. Do you mind just explaining why I need to `quote()` `count` but I can keep `quo_var` as is? – ℕʘʘḆḽḘ Aug 22 '17 at 18:21
  • 3
    Because `aes_q` expects symbols (or variables that resolve to symbols). In this case you want to literally just use `count`, not a variable named "count" so you quote it. But `quo_var` is a variable that contains the symbol-like expression `group` so you need to evaluate that variable. – MrFlick Aug 22 '17 at 18:23
  • thanks, I have to admit this is confusing as hell.. Need to ponder about this for a while... :D thanks again!!!! – ℕʘʘḆḽḘ Aug 22 '17 at 18:24
  • just a quick follow up. Is it possible to use `factor` on a `enquote`? say something like `ggplot(df_agg, aes_q(x = quote(count), y = quote(mean), color = quo_var, group = factor(quo_var)))` Problem is, my variable is taken as a numeric and I want a factor instead.. thanks again!! – ℕʘʘḆḽḘ Aug 22 '17 at 19:23
  • maybe that is worth another question? this code actually fails if we use a factor variable... I have no idea why – ℕʘʘḆḽḘ Aug 22 '17 at 19:35
  • 2
    That's a more complicated problem which would probably be better addressed with a separate question. Something like `color = bquote(factor(.(quo_var[[2]])))` might work. – MrFlick Aug 22 '17 at 19:36
  • insane! thanks. yes, let me ask another question as a follow up. thanks!! – ℕʘʘḆḽḘ Aug 22 '17 at 19:37
  • https://stackoverflow.com/questions/45826042/how-to-create-factor-variables-from-quosures-in-functions-using-ggplot-and-dplyr – ℕʘʘḆḽḘ Aug 22 '17 at 20:00
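
For the factor question raised in the comments above, here is a minimal sketch that assumes ggplot2 3.0.0 or later, where aes() accepts !! directly. The function name plot_by_factor, the column name grp, and the sample data are hypothetical and chosen only for illustration; this is not the approach from the linked follow-up question.

```r
library(dplyr)
library(ggplot2)

# Hypothetical helper: colours points by a numeric column treated as a factor.
# Assumes ggplot2 >= 3.0.0, so that !! can be used inside aes().
plot_by_factor <- function(data, mygroup) {
  quo_var <- enquo(mygroup)

  ggplot(data, aes(x = id, y = value, color = factor(!!quo_var))) +
    geom_point()
}

# Illustrative data with a numeric grouping column
df <- tibble(id = 1:4,
             grp = c(1, 2, 3, 3),
             value = c(200, 400, 120, 300))

p <- plot_by_factor(df, grp)
```

Wrapping the unquoted variable in factor() inside aes() leaves the data frame itself untouched; only the mapping treats the column as discrete.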
7

ggplot2 v3.0.0, released in July 2018, supports !! (bang bang), !!!, and :=. aes_()/aes_q() and aes_string() are soft-deprecated.

OP's original code should now work:

library(tidyverse)

get_charts1 <- function(data, mygroup){

  quo_var <- enquo(mygroup)

  df_agg <- data %>% 
    group_by(!!quo_var) %>% 
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>% 
    ungroup()

  ggplot(df_agg, aes(x = count, y = mean, 
                color = !!quo_var, group = !!quo_var)) + 
    geom_point() +
    geom_line() 
}

get_charts1(dataframe, group)

Edit: using the tidy evaluation pronoun .data with [[ ]] to pick the chosen variable by name (passed as a string) also works:

get_charts2 <- function(data, mygroup){

  df_agg <- data %>% 
    group_by(.data[[mygroup]]) %>% 
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>% 
    ungroup()

  ggplot(df_agg, aes(x = count, y = mean, 
                     color = .data[[mygroup]], group = .data[[mygroup]])) + 
    geom_point() +
    geom_line() 
}

get_charts2(dataframe, "group")

Created on 2018-04-04 by the reprex package (v0.2.0).
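
As a further variation, the enquo() and !! pair can be collapsed into the embrace operator {{ }}. This is a minimal sketch, assuming rlang 0.4.0 or later (which introduced {{ }}) together with a ggplot2 version that supports tidy evaluation in aes(). The name get_charts3 is hypothetical, and the sample data is rebuilt from the question so that the block is self-contained.

```r
library(dplyr)
library(ggplot2)

# Hypothetical variant of get_charts1 using the embrace operator.
# Assumes rlang >= 0.4.0 and a `value` column in `data`, as in the question.
get_charts3 <- function(data, mygroup) {

  df_agg <- data %>%
    group_by({{ mygroup }}) %>%
    summarize(mean = mean(value, na.rm = TRUE),
              count = n()) %>%
    ungroup()

  ggplot(df_agg, aes(x = count, y = mean,
                     color = {{ mygroup }}, group = {{ mygroup }})) +
    geom_point() +
    geom_line()
}

# Sample data from the question
dataframe <- tibble(id = c(1, 2, 3, 4),
                    group = c('a', 'b', 'c', 'c'),
                    value = c(200, 400, 120, 300))

p <- get_charts3(dataframe, group)
```

The grouping column is still passed unquoted, as in get_charts1(dataframe, group); the only difference is that the capture and unquote steps happen in one place.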

Tung