What is the difference between the "+" operator in ggplot2 and the "%>%" operator in magrittr?

I was told that they are the same. However, consider the following script:

library(magrittr)
library(ggplot2)

# 1. This works
ggplot(data = mtcars, aes(x=wt, y = mpg)) + geom_point()

# 2. This works
ggplot(data = mtcars) + aes(x=wt, y = mpg) + geom_point()

# 3. This works
ggplot(data = mtcars) + aes(x=wt, y = mpg) %>% geom_point()

# 4. But this doesn't
ggplot(data = mtcars) %>% aes(x=wt, y = mpg) %>% geom_point()
Gregor Thomas
Anthony Ebert
    Also, tangentially, you don't need all those imports. Including them in your example makes it hard to rule out cross library interference. – Matthew Drury Feb 11 '16 at 05:49
    I've edited your question to use built-in data instead of your own data, to eliminate unused packages, and to make the whole thing copy/paste-able. – Gregor Thomas Feb 11 '16 at 07:46

1 Answer

Piping is very different from ggplot2's addition. The pipe operator, %>%, takes the result of the left-hand side and passes it as the first argument of the function call on the right-hand side. For example:

1:10 %>% mean()
# [1] 5.5

is exactly equivalent to mean(1:10). The pipe is more useful for replacing nested function calls, e.g.,

x = factor(2008:2012)
x_num = as.numeric(as.character(x))
# could be rewritten to read from left-to-right as
x_num = x %>% as.character() %>% as.numeric()

This is all explained nicely over at "What does %>% mean in R?"; read through that for a couple more examples.

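Using this knowledge, we can rewrite your pipe examples as nested function calls and see that they still do the same things. Because %>% binds more tightly than +, in #3 only aes(...) is piped into geom_point(), whereas in #4 the whole plot is piped into aes(). Now it is (hopefully) obvious why #4 doesn't work: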

# 3. This is acceptable ggplot2 syntax
ggplot(data = mtcars) + geom_point(aes(x=wt, y = mpg))

# 4. This is not
geom_point(aes(ggplot(data = mtcars), x=wt, y = mpg))
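The grouping in #3 can be checked without running any plotting code. The sketch below parses an expression shaped like example #3 and inspects it; p, a and g are placeholder names standing in for the plot, the aes() call and geom_point(), and nothing is evaluated, so neither magrittr nor ggplot2 has to be loaded.

```r
# Parse (but do not evaluate) an expression with the same shape as example #3.
# p, a and g are placeholders for ggplot(...), aes(...) and geom_point.
expr <- quote(p + a %>% g())

# The outermost call is `+`, which means the pipe was grouped first.
top_call <- as.character(expr[[1]])

# The right-hand operand of `+` is the entire piped call.
right_operand <- expr[[3]]
right_operator <- as.character(right_operand[[1]])

print(top_call)
print(deparse(right_operand))
```

Since the pipe is the right-hand operand of +, example #3 is read as plot + (aes(...) %>% geom_point()), which matches the nested rewrite shown above.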

ggplot2 includes a special "+" method for ggplot objects, which it uses to add layers to plots. I didn't know until you asked your question that it also works with the aes() function, but apparently that's defined as well. These are all specially defined within ggplot2. The use of + in ggplot2 predates the pipe, and while the usage is similar, the functionality is quite different.
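To illustrate what a specially defined "+" method can look like, here is a minimal sketch using base R's S3 operator dispatch. The toy_plot and toy_layer classes and their fields are hypothetical and invented for this example; this is not ggplot2's actual code, only the same general idea of overloading + for a class.

```r
# Hypothetical toy classes for illustration; not ggplot2's implementation.
toy_plot <- function(data) {
  structure(list(data = data, layers = list()), class = "toy_plot")
}

toy_layer <- function(name) {
  structure(list(name = name), class = "toy_layer")
}

# S3 method for `+`: used when the left operand has class "toy_plot".
# It appends the right operand to the plot's list of layers.
"+.toy_plot" <- function(e1, e2) {
  e1$layers <- c(e1$layers, list(e2))
  e1
}

p <- toy_plot(mtcars) + toy_layer("points") + toy_layer("smooth")
layer_names <- vapply(p$layers, function(l) l$name, character(1))
print(layer_names)
```

Here + never passes the plot as an argument to toy_layer(); each layer is built on its own and then merged into the plot, which is the key difference from the pipe.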

As an interesting side-note, Hadley Wickham (the creator of ggplot2) said that:

...if I'd discovered the pipe earlier, there never would've been a ggplot2, because you could write ggplot graphics as

ggplot(mtcars, aes(wt, mpg)) %>%
  geom_point() %>%
  geom_smooth()
Gregor Thomas
    i simply overlooked the precedence of %>% over + in my hasty comment, it's not surprising after all – baptiste Feb 11 '16 at 08:28
  • Yeah, the comment I was responding to is gone. Cleaning up some of this now. – Gregor Thomas Apr 07 '20 at 17:51
    Wow, that quote from Hadley Wickham is dynamite. It helped explain so much of my confusion. No wonder ggplot2 is so confusing. It's attempting to reinvent the pipe! The plus operator now makes perfect sense: it's ggplot2 specific, and I should stop trying to understand it. It's just how ggplot2 adds layers, nothing more. – Tom Rose Apr 21 '21 at 15:14