I am wondering whether it is possible to access the columns of the data passed to ggplot2 when setting the plot title. Something like this:

ggplot(mpg %>% filter(manufacturer == 'audi'), 
  aes(x = hwy, y = displ, label = model)) + 
    geom_point() + 
    geom_text(data = . %>% filter(hwy > 28)) + 
    ggtitle(unique(.$manufacturer))

I often create plots like the one above that show only a subset of the data, and I would love to label them automatically from that subset. At the moment, the . is not recognised within ggtitle, although it works within geom_text.

EDIT:
Following the helpful comment and duplicate flag from @Brian: is there a way to use this {} trick together with dplyr::group_by? It does not work as I expected. I would love to create an individual plot for each group, but the complete data frame goes into the ggplot call.

mpg %>%
  group_by(manufacturer) %>% {
    ggplot(., aes(cyl, displ, color=manufacturer)) +
      geom_point() + 
      ggtitle(unique(.$manufacturer))
  }

The title says audi, but all manufacturers are drawn in a single plot.

drmariod
  • Maybe you can assign it to an object, i.e. `p <- ggplot(...) + geom_text(...)`, and then extract the data, i.e. `p + ggtitle(unique(p$data$manufacturer))` – akrun Apr 03 '18 at 14:03
  • Sounds like a good workaround. Is there any logical reason, why I can't access the data with `.`? – drmariod Apr 03 '18 at 14:06
  • I asked a similar [question](https://stackoverflow.com/questions/44025379/dplyr-and-ggplot-piping-is-not-working-as-expected) recently. But no solution yet. – Roman Apr 03 '18 at 14:08
  • I think the class is different i.e. Layer, LayerInstance, ggproto to extract the data – akrun Apr 03 '18 at 14:10
  • @akrun seems so `ggplot(mpg %>% filter(manufacturer == 'audi'), aes(x = hwy, y = displ, label = model, title=quo_name(unique(manufacturer)))) + geom_point() + geom_text(data = . %>% filter(hwy > 28)) ` – Roman Apr 03 '18 at 14:14
  • Possible duplicate of [How do I access the data frame that has been passed to ggplot()?](https://stackoverflow.com/questions/45088454/how-do-i-access-the-data-frame-that-has-been-passed-to-ggplot) – Brian Apr 03 '18 at 17:51
  • You wrap the call to `ggplot` and its `+`-children with `{}`: https://stackoverflow.com/a/45088522/3330437 – Brian Apr 03 '18 at 17:52
  • @Brian this kind of solves my problem, but maybe an additional question, why is this not working with `group_by`? I added an example to my question but could also open a new question. Thanks. – drmariod Apr 04 '18 at 07:12
  • Possible duplicate of [Apply a ggplot-function per group with dplyr and set title per group](https://stackoverflow.com/questions/29034863/apply-a-ggplot-function-per-group-with-dplyr-and-set-title-per-group) – Roman Apr 04 '18 at 10:45
  • The `$` operator overrides/ignores grouping. I'm wondering if the dplyr `distinct` and `pull` verbs would help. – Brian Apr 04 '18 at 13:36
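
The two workarounds mentioned in the comments can be sketched roughly as follows. This is only an illustration for the single-manufacturer case; the object names `p_braces`, `p_base` and `p_data` are made up for the example.

```r
library(dplyr)
library(ggplot2)

# Workaround 1: wrap the whole ggplot expression in braces. magrittr then
# does not insert the data as the first argument, and `.` refers to the
# filtered data everywhere inside the block.
p_braces <- mpg %>%
  filter(manufacturer == "audi") %>% {
    ggplot(., aes(x = hwy, y = displ, label = model)) +
      geom_point() +
      geom_text(data = filter(., hwy > 28)) +
      ggtitle(unique(.$manufacturer))
  }

# Workaround 2: build the plot first, then read the data back from the
# plot object to create the title.
p_base <- ggplot(mpg %>% filter(manufacturer == "audi"),
                 aes(x = hwy, y = displ, label = model)) +
  geom_point()
p_data <- p_base + ggtitle(unique(p_base$data$manufacturer))
```

Both variants only give a sensible title when the data contains a single manufacturer, because `unique()` returns one value per manufacturer.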

3 Answers

I would try the following, since the . placeholder cannot be used inside ggtitle() here (the ggplot call is not inside a pipe that would define it).

ggplot(mpg %>% filter(manufacturer == 'audi'), 
       aes(x = hwy, y = displ, label = model)) + 
  geom_point() + 
  geom_text(data = . %>% filter(hwy > 28)) +
  facet_wrap(~manufacturer)+
  theme(strip.background = element_blank(),
        strip.text = element_text(hjust = 0, size = 14))

The idea is to use a facet with an empty strip background, so the strip text looks like a title. If the data contains more than one name or variable, you have to create an extra faceting variable first, e.g. with mutate(title = "This is my plot"):

mpg %>% 
  mutate(title="This is my plot") %>% 
ggplot(aes(x = hwy, y = displ, col=manufacturer)) + 
  geom_point() + 
  facet_wrap(~title)+
  theme(strip.background = element_blank(),
        strip.text = element_text(hjust = 0, size = 14))

Edit

Since you asked a second question, here are two solutions for creating individual plots for each group:

# first solution
p <- mpg %>%
  group_by(manufacturer) %>% 
     do(plots= ggplot(., aes(cyl, displ)) +
      geom_point() + 
      ggtitle(unique(.$manufacturer))
   )
p %>% slice(1) %>% .$plots


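# second solution (needs library(tidyr) and library(purrr))
# note: nest(-manufacturer) is deprecated since tidyr 1.0; name the list-column explicitly
mpg %>%
  nest(data = -manufacturer) %>%
  mutate(plot = map2(data, manufacturer, ~ggplot(data = .x, aes(cyl, displ)) +
           geom_point() +
           ggtitle(.y))) %>% 
  slice(1) %>% .$plot 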

Or save the plots from the first solution using

map2(paste0(p$manufacturer, ".pdf"), p$plots, ggsave)
Roman
  • I used this already as a workaround but never thought of removing the background color. Good idea, actually this would be sufficient, since those graphs are often not meant for production use. – drmariod Apr 03 '18 at 17:44
Here are two ways I've done this using split. You can use split to split your dataframe into a named list of dataframes, based on a variable. So calling split(mpg, mpg$manufacturer) gives you a list of dataframes, where each dataframe is associated with a manufacturer, e.g. split_df$audi is the dataframe of all observations for Audi.

library(dplyr)
library(purrr)
library(ggplot2)

# `.` only exists inside a pipe, so refer to the column through mpg directly
split_df <- split(mpg, mpg$manufacturer)

The first way is to just call ggplot on a single item in the list. Since the list is named, names(split_df)[1] will give you the name, "audi".

ggplot(split_df[[1]], aes(x = hwy, y = displ, label = model)) + 
    geom_point() + 
    geom_text(data = . %>% filter(hwy > 28)) +
    ggtitle(names(split_df)[1])

That's kinda cumbersome, especially if you want plots for multiple manufacturers. When I've done this, I've used the map functions from purrr. What's really cool is imap, which maps over both the list and its names. Here I'm making a list of plots by mapping over the list of dataframes; each plot gets a title from the name of that list item.

plots <- imap(split_df, function(df, manufacturer) {
    ggplot(df, aes(x = hwy, y = displ, label = model)) +
        geom_point() +
        geom_text(data = . %>% filter(hwy > 28)) +
        ggtitle(manufacturer)
})

plots$audi

Then I can pull up a specific item from that list of plots. This is also handy for if you need to use walk to map over them and save every plot or a subset of plots, or if you need to use a grid function to arrange them into output, or really anything else cool that purrr is great for.
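
Saving a subset of the plots with walk could look roughly like this. It is a sketch only: the output folder `out_dir`, the PDF format and the plot size are assumptions made for the example, and the plots are rebuilt here in a simplified form so the block runs on its own.

```r
library(purrr)
library(ggplot2)

split_df <- split(mpg, mpg$manufacturer)
plots <- imap(split_df, function(df, manufacturer) {
  ggplot(df, aes(x = hwy, y = displ)) +
    geom_point() +
    ggtitle(manufacturer)
})

# Hypothetical output folder inside the session's temp directory
out_dir <- file.path(tempdir(), "plots")
dir.create(out_dir, showWarnings = FALSE)

# iwalk() calls the function once per element for its side effect,
# passing the plot and its name; here only two plots are saved.
iwalk(plots[c("audi", "honda")], function(p, name) {
  ggsave(file.path(out_dir, paste0(name, ".pdf")), plot = p,
         width = 6, height = 4)
})
```

Using iwalk rather than imap signals that only the side effect (the files on disk) matters, not a return value.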

camille
Answering the follow-up about grouping by a variable to generate plots. I'm not sure how one would do this, nor what the output of a group_by(foo) %>% ggplot() call would be (generate the plots one by one? store them? make a big plot of all of them stacked?).
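
A quick check suggests why the grouped pipe in the question ends up as one plot: as far as I can tell, group_by() only attaches grouping metadata, so what reaches ggplot() is still the full data frame. The name `grouped` is just for illustration.

```r
library(dplyr)
library(ggplot2)

grouped <- mpg %>% group_by(manufacturer)

# Same rows as the original data; nothing has been split off
nrow(grouped) == nrow(mpg)

# The groups exist only as metadata on the data frame
n_groups(grouped)

# `$` ignores the grouping and returns the whole column
length(unique(grouped$manufacturer))
```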

One way you might try this is via lapply(). Building on the solution you found to access the data from within the call:

library(dplyr)
library(ggplot2)

mfgs <- mpg %>% pull(manufacturer) %>% unique()
plot_list <- lapply(mfgs, function(mfg) {
  mpg %>%
  filter(manufacturer == mfg) %>% {
    ggplot(., aes(cyl, displ, color=manufacturer)) +
      geom_point() + 
      ggtitle(unique(.$manufacturer))
    }
})

From there, you can access a given plot by position with plot_list[[i]]. To call plots by name instead, either pass a named vector to lapply() in place of mfgs as defined above, or just set the names of the list to the vector of manufacturers afterwards:

names(plot_list) <- mfgs
plot_list[["nissan"]]
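
The named-vector variant could be sketched like this, relying on lapply() keeping the names of its input. It is a slightly simplified version of the code above (no color aesthetic, and the title is taken from `mfg` directly).

```r
library(dplyr)
library(ggplot2)

mfgs <- mpg %>% pull(manufacturer) %>% unique()
# Name each element after itself so lapply() returns a named list
mfgs <- setNames(mfgs, mfgs)

plot_list <- lapply(mfgs, function(mfg) {
  mpg %>%
    filter(manufacturer == mfg) %>% {
      ggplot(., aes(cyl, displ)) +
        geom_point() +
        ggtitle(mfg)
    }
})

plot_list[["nissan"]]
```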

Hope that helps.

Hendy