36

I want to set the string N=xxx as the title of my figure, where xxx is the number of observations in the data frame that I pass as the data argument to ggplot(). In my current code, I explicitly pass that data frame a second time as an argument to sprintf() which I use inside of labs():

ggplot(mtcars, aes(mpg, hp)) + 
    labs(title=sprintf("N=%i", nrow(mtcars))) + 
    geom_point()

This does produce the desired title, but it won't work with more complex tasks: I use a dplyr pipe to construct the data frame that is being plotted, and as this is a time-consuming process, I wouldn't want to repeat the pipe a second time to obtain the number of rows like in the example.

So, how do I access the data frame that has been passed as an argument to ggplot() from within the argument specifications of the functions that are used to modify the plot?

Schmuddi

2 Answers

58
mtcars %>% {
  ggplot(., aes(mpg, hp)) + 
  labs(title = paste("N =", nrow(.))) + 
  geom_point()
}

Note that when you wrap the whole ggplot call in curly braces {...}, magrittr no longer inserts the piped object as the first argument, so you must pass it explicitly with the . (dot) placeholder as the data argument: ggplot(., ...). You can then refer to that same object with . anywhere else inside the braces.
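
Applied to the situation in the question, where the data frame comes out of a dplyr pipe, the same pattern might look like the sketch below. The filter() step is only an assumed stand-in for the real, time-consuming pipeline, and p is just an illustrative variable name.

```r
library(ggplot2)
library(dplyr)  # dplyr re-exports magrittr's %>%

p <- mtcars %>%
  filter(cyl == 4) %>%   # assumed placeholder for an expensive pipeline
  {
    # Inside the braces, `.` is the result of the pipeline so far,
    # so it can serve as both the plot data and the source of the row count.
    ggplot(., aes(mpg, hp)) +
      labs(title = sprintf("N=%i", nrow(.))) +  # "N=11": mtcars has 11 four-cylinder cars
      geom_point()
  }

p  # printing the object draws the plot
```

The pipeline is evaluated only once; both ggplot() and nrow() see the same filtered data frame.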

Brian
  • Excellent, this is exactly what I was looking for. I vaguely remembered something about using `.`, but didn't know where to look that up. Thanks to your answer, I found that this is described in the `magrittr` vignette: https://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html – Schmuddi Jul 13 '17 at 18:42
  • You can also use `sprintf` instead of `paste` of course, for more control over the formatting. – Brian Jul 13 '17 at 18:43
  • This is very nice, I didn't think of using {} to create an "function" like this. – rsmith54 Oct 30 '19 at 21:00
  • 1
    @rsmith54, I find myself doing it all the time with `%>%` and non-tidyverse functions that put the `data` argument later on, like `lmer`. – Brian Oct 31 '19 at 00:01
  • 1
    Yeah, it is a really nice trick. I wonder if it would be useful to add some information on the mechanism to the answer as well. – rsmith54 Oct 31 '19 at 13:41
  • Why does this not work with "ggcorr" which is also based on "ggplot2"? – Ömer An Jan 17 '23 at 03:12
  • @ÖmerAn I've never used `ggcorr` but looking at its documentation, I see no reason it shouldn't work the same. Can you open a new question with your code that's not working? – Brian Jan 18 '23 at 18:16
12

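Another option takes advantage of a different magrittr pipelining feature: the tee operator %T>%, which evaluates its right-hand side for its side effect and then passes the original left-hand value on down the pipe.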

library(ggplot2)
library(magrittr)
# pre-define nr in the enclosing environment so it is clear which variable <<- will overwrite
nr <- "oops"
mtcars %T>%
  { nr <<- nrow(.) } %>%
  ggplot(aes(mpg, hp)) +
  labs(title = sprintf("N=%i", nr)) +
  geom_point()

(This can also be done with dplyr's do(), as long as the block returns the data frame itself so the pipe can continue: do({nr <<- nrow(.); .}) %>%.)
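
To see what the tee operator does in isolation, here is a minimal sketch without any plotting. The name result is only illustrative.

```r
library(magrittr)

result <- mtcars %T>%
  { message("rows seen by the tee step: ", nrow(.)) } %>%  # value of this block is discarded
  nrow()                                                  # receives mtcars, not the block's value

result  # 32
```

The braces block runs only for its side effect (here a message). %T>% then hands the unchanged mtcars to the next step, which is why ggplot() in the answer still receives the data frame after nr has been set.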

This differs from Brian's answer in two ways:

  1. Subjectively "cleaner looking", in that the ggplot code is not indented within a code block. (As commented, though, the blending of different pipelines could be a negative as well.)

  2. It has a side effect: it creates nr outside of both the data pipeline and the ggplot call. Pre-assigning nr makes it explicit which variable gets overwritten, which I think mitigates reaching outside of the local environment, but it's still a little sloppy.
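
If the side effect in point 2 is a concern, one possible way around it is to move the plotting code into a small function, so that the row count stays local to that function. This is only a sketch: plot_with_n is a hypothetical helper name, not part of ggplot2 or magrittr.

```r
library(ggplot2)
library(magrittr)

# Hypothetical helper: the piped data frame arrives as `df`, so nothing
# has to be assigned outside the function with <<-.
plot_with_n <- function(df) {
  ggplot(df, aes(mpg, hp)) +
    labs(title = sprintf("N=%i", nrow(df))) +
    geom_point()
}

p <- mtcars %>% plot_with_n()
p  # printing the object draws the plot
```

This keeps the ggplot code out of a braces block at the point of use, at the cost of defining the function separately.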

r2evans
  • I agree with your assessment in (2) that this is a bit sloppy due to its side effects, which is why I prefer Brian's answer. But this still is a very interesting use of the tee operator. – Schmuddi Jul 13 '17 at 18:51
  • Subjectively, I (of course, haha) prefer having the `ggplot` call set off within braces. That makes it a cleaner separation between processing and plotting objects in the chain. – Brian Jul 13 '17 at 18:52
  • Good point, Brian, the hybrid pipelines have always been "visually disturbing". Schmuddi -- agree fully. It would be nice (albeit 100s of hours of work and breaking incompatibility) if `ggplot2` honored `%>%` instead of overloading `+`. – r2evans Jul 13 '17 at 18:54
  • 1
    There was an attempt at reconciling them: https://github.com/thomasp85/pipeplotter e.g. `ggplot(mtcars, aes(mpg, wt)) %>% add_points()` – Brian Jul 13 '17 at 18:55
  • 1
    Also: https://github.com/tidyverse/ggplot2/issues/1954, referring to a [tweet by Hadley](https://twitter.com/hadleywickham/status/648851767254380545) about the pipe and 20/20 hindsight. – r2evans Jul 13 '17 at 19:06