
I have a general question that I couldn't find a satisfactory answer for. I'm building a set of visualization functions, and I want to let the user have flexibility when using them. For example, I'd like to keep it optional whether errorbars should be included in a bar plot, or whether labels in geom_text() will be in percent or decimal.

If we think of a typical construction of code in ggplot(), we have elements separated by +. So if I want to allow optional construction, I'd likely need to "turn on" or "turn off" either entire geoms (e.g., completely ignore geom_errorbar() if the user doesn't want errorbars in the plot), or otherwise tweak within geoms (e.g., changing only the label argument within geom_text() to convert labels to percent or keep them in decimals).

I hope my question doesn't invite overly opinion-based answers, but rather has people lay out the standard/typical way of coding optional arguments when wrapping ggplot() in custom functions.

Example

I came up with a solution that I dislike. I think it makes the code long and hard to read. I also can't say much about whether it's computationally efficient.
Say that I want to build a custom function for a bar chart. There are several things that I want to be "tweakable":

  1. Whether bars should be ordered or not (see reorder_cols argument)
  2. Whether to provide user's own set of x axis labels (see x_axis_labels argument)
  3. Whether to add errorbars (add_errorbar)
  4. Whether to show bar labels in percent (show_in_percents)

Then I assign each optional code into a variable, and use a conditional to determine which piece of code should be included, according to bar_chart()'s relevant argument.

library(tidyverse)
library(broom)
#> Warning: package 'broom' was built under R version 4.0.3

bar_chart <- function(data, x_var, y_value, reorder_cols = TRUE, x_axis_labels = NULL, add_errorbar = FALSE) {
  
  reordered <- geom_bar(stat = "identity", width = 0.8, aes(x = reorder({{ x_var }}, -{{ y_value }} ), y = {{ y_value }}, fill = as_factor({{ x_var }})))
  not_reordered <- geom_bar(stat = "identity", width = 0.8, aes(x = {{ x_var }}, y = {{ y_value }}, fill = as_factor({{ x_var }})))
  
  with_x_axis_labels <- scale_x_discrete(labels = setNames(str_wrap(x_axis_labels, 10), names(x_axis_labels)), expand = c(0, 0))
  without_x_axis_labels <- scale_x_discrete(expand = c(0, 0))
  
  if (reorder_cols == TRUE) {
    my_geom_bar <- reordered
  } else {
    my_geom_bar <- not_reordered
  }
  
  if (is.null(x_axis_labels)) {
    my_scale_x_discrete <- without_x_axis_labels
  } else {
    my_scale_x_discrete <- with_x_axis_labels
  }
  
  # isTRUE() also copes with NULL or NA, where `== TRUE` would make if() fail
  if (isTRUE(add_errorbar)) {
    my_errorbar <-  geom_errorbar(aes(x = {{ x_var }}, y = {{ y_value }}, ymin = errorbar_lwr, ymax = errorbar_upr), width = 0.1, size = 0.75)
  } else {
    my_errorbar <- NULL
  }
  
  
  ggplot(data) +
    my_geom_bar +
    my_errorbar +
    my_scale_x_discrete
  
}


labels_for_barplot <-
  c("bar_1", "bar_2", "bar_3")

mtcars %>%
  lm(mpg ~ factor(carb), data = .) %>%
  broom::tidy() %>%
  mutate(errorbar_lwr = estimate - std.error,
         errorbar_upr = estimate + std.error) %>%
  bar_chart(data = ., x_var = term, y_value = estimate, reorder_cols = TRUE, add_errorbar = TRUE, x_axis_labels = labels_for_barplot)

Created on 2021-01-17 by the reprex package (v0.3.0)

All in all, I gave this example to ask whether there are alternative, more concise ways of implementing optional arguments in a wrapper around ggplot().


EDIT


As @Tjebo noted, my code was buggy and simply wouldn't run. I updated it, also removing the part about geom_text() as it was too confusing.

Emman

1 Answer


This will not attempt to answer all of the questions (as there are several), but just demonstrates the principle, which you could make use of. Check out the chapter on programming with ggplot2 in the ggplot2 book.

The idea is to create a list which contains all the ggplot components (such as aes, geoms, scales). Elements that evaluate to NULL are simply discarded when the list is added to the plot. That's the whole beauty.

I have removed the scale because it was somewhat difficult to understand what you wanted to achieve. The idea would be very similar. I have also reduced the entire problem to what I believe is the gist of the question.

library(tidyverse)

bar_chart <- function(data, xvar, yvar,
                      se = TRUE, show_percents = TRUE,
                      myscale = TRUE) {
  # Capture the column name as a string so it can index `data` outside aes()
  newy <- deparse(substitute(yvar))
  if (show_percents) {
    my_label <- paste0(100 * round(data[[newy]], 2), "%")
  } else {
    my_label <- round(data[[newy]], 2)
  }

  ggplot(data, aes({{ xvar }}, {{ yvar }})) +
    list(
      geom_col(width = 0.8),
      geom_text(vjust = 1.4, color = "white", size = 6, fontface = "bold", label = my_label),
      if (se) geom_errorbar(aes(ymin = {{yvar}} - .1, ymax = {{yvar}} + .1), width = 0.1, size = 0.75)
    )
}

iris2 <- iris %>%
  group_by(Species) %>%
  slice_max(Sepal.Length)

bar_chart(iris2, Species, Sepal.Length)

bar_chart(iris2, Species, Sepal.Length, se = FALSE)

Created on 2021-01-17 by the reprex package (v0.3.0)
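The scale that was left out above can probably be handled the same way, with an `if`/`else` inside the list. The following is only a minimal sketch: the wrapper name `bar_chart_scaled()`, its `x_axis_labels` argument and the toy data frame `df` are assumptions made for illustration and are not part of the code above.

```r
library(ggplot2)

# Hypothetical wrapper: every optional piece is an element of one list.
# Elements that evaluate to NULL are dropped when the list is added.
bar_chart_scaled <- function(data, xvar, yvar, x_axis_labels = NULL, se = FALSE) {
  ggplot(data, aes({{ xvar }}, {{ yvar }})) +
    list(
      geom_col(width = 0.8),
      # Whole layer switched on or off: if() without else yields NULL when se is FALSE
      if (se) geom_errorbar(aes(ymin = {{ yvar }} - 0.1, ymax = {{ yvar }} + 0.1), width = 0.1),
      # Same scale either way; only the labels argument differs
      if (is.null(x_axis_labels)) {
        scale_x_discrete(expand = c(0, 0))
      } else {
        scale_x_discrete(labels = x_axis_labels, expand = c(0, 0))
      }
    )
}

# Toy data, made up for this sketch
df <- data.frame(group = c("a", "b", "c"), value = c(1, 2, 3))

p1 <- bar_chart_scaled(df, group, value)
p2 <- bar_chart_scaled(df, group, value,
                       x_axis_labels = c(a = "First", b = "Second", c = "Third"),
                       se = TRUE)

length(p1$layers)  # bars only
length(p2$layers)  # bars plus error bars
```

Keeping the conditionals inside the list avoids the separate `my_*` variables, at the cost of a slightly denser plotting expression.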

tjebo
  • @emman if you wonder about the percent labels - this is a result of your exact code! – tjebo Jan 17 '21 at 13:36
  • Thank you. This raises several follow-up questions. First, am I correct in inferring that turning entire `geoms` on or off (such as `geom_errorbar()`) can be achieved with `if (se) geom_errorbar(...)`, **but** tweaking arguments within `geoms` needs a separate conditional assignment (e.g. `my_label`)? – Emman Jan 17 '21 at 13:58
  • @emman I think this very much depends on the design of your function and how much you want the user to be able to tweak things. The book I linked to actually addresses this problem. Another option may be to use `...` – tjebo Jan 17 '21 at 14:02
  • I see, OK, I shall read through the book more thoroughly. The second question has to do with your choice to use `newy <- deparse(substitute(yvar))`, although I'm not sure whether it concerns `ggplot2` or evaluation in general. Is this to work around the problem of using `{{ yvar }}` directly within `geom_text()`? In other words, when do we use `deparse(substitute(yvar))` and when simply `{{ yvar }}`? – Emman Jan 17 '21 at 14:09
  • The simple answer would be that `{{` is reserved for use within specific tidyverse functions such as `aes()`; this may be an oversimplification, though – tjebo Jan 17 '21 at 14:59
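The two suggestions from the comments can be combined in one rough sketch: forwarding `...` to a geom so the caller can tweak it, and building the label inside `aes()`, where `{{ }}` is understood, so that no `deparse(substitute())` step is needed. The wrapper name `labelled_bars()`, the helper `label_fun` and the toy data are hypothetical and only serve as illustration.

```r
library(ggplot2)

# Hypothetical wrapper: `...` is passed on to geom_text(), so the caller
# decides on vjust, colour, size and so on.
labelled_bars <- function(data, xvar, yvar, show_percents = FALSE, ...) {
  # Pick the formatting function once; it is applied to the column inside aes()
  label_fun <- if (show_percents) {
    function(v) paste0(round(100 * v), "%")
  } else {
    function(v) round(v, 2)
  }

  ggplot(data, aes({{ xvar }}, {{ yvar }})) +
    geom_col(width = 0.8) +
    geom_text(aes(label = label_fun({{ yvar }})), ...)
}

# Toy data, made up for this sketch
df <- data.frame(group = c("a", "b", "c"), share = c(0.25, 0.5, 0.25))

p <- labelled_bars(df, group, share, show_percents = TRUE,
                   vjust = 1.4, colour = "white")
```

One caveat with this design: an argument that the wrapper already hard-codes in `geom_text()` cannot also be supplied through `...`, which is why the sketch leaves all styling to the caller.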