I'm trying to pass a subset of the plot data directly to the data argument of a geom in ggplot, and am trying to understand what seems to be inconsistent behavior.

If I use data = . %>% filter() it works, but if I try to use data = filter(.) I get an error message. Outside of a ggplot flow those two syntaxes are normally interchangeable, so what's going on here?

library(tidyverse)

# piping in, works
ggplot(data = cars, aes(x = speed, y = dist)) +
  geom_point(data = . %>% filter(speed > 10))

# '.' in function, error: "object '.' not found"
ggplot(data = cars, aes(x = speed, y = dist)) +
  geom_point(data = filter(. , speed > 10))
Lief Esbenshade

1 Answer

Here's how I understand what's going on under the hood.

In general, according to magrittr syntax,

. %>% some_function(...)

is short for

function(x) some_function(x, ...)

So in your case,

. %>% filter(speed > 10)

can be expanded to

function(x) filter(x, speed > 10)

We can confirm that indeed

ggplot(data = cars, aes(speed, dist)) + geom_point(data = function(x) filter(x, speed > 10))    

works and gives the same result as your "piping in, works" example.
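
The equivalence can also be checked outside of ggplot2. The following is a minimal sketch, assuming magrittr and dplyr are installed; the names keep_fast and keep_fast_fn are made up for illustration and do not come from either package.

```r
library(magrittr)
library(dplyr)

# A pipeline that starts with `.` is not evaluated; magrittr returns a function.
keep_fast <- . %>% filter(speed > 10)

# The hand-written anonymous function it stands for (hypothetical name).
keep_fast_fn <- function(x) filter(x, speed > 10)

# Both are functions, and both give the same rows when applied to cars.
is.function(keep_fast)
is.function(keep_fast_fn)
identical(keep_fast(cars), keep_fast_fn(cars))
```

Because keep_fast is a function rather than a data.frame, it is the layer that later calls it with the plot data.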

So the key here is to recognise that data (inside geom_point) can take a function as an argument. Here it is an anonymous function, which ggplot2 applies to the data argument of your main ggplot call.

To quote from ggplot2 issue #1486

This PR will make it possible to supply a function as the data argument to the layer function. The function will be applied to the global data of the plot and the result will be used in the layer.

With all of this in mind it then becomes clear why

... + geom_point(data = filter(. , speed > 10))

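doesn't work. filter(. , speed > 10) is an ordinary function call that R evaluates immediately; since it is not part of a magrittr pipeline, nothing defines an object named ., hence the "object '.' not found" error. data in geom_* needs to be either a data.frame or a function returning a data.frame, and such a function is applied to the data argument of the main ggplot call.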
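
To make the distinction concrete, here is a small sketch, assuming ggplot2 and dplyr are installed. It builds the layer both ways that are expected to work (a ready-made data.frame and a function), compares the rows each layer would draw using ggplot2's layer_data(), and captures the failure of the bare-dot version with tryCatch. The object names base, p_df, p_fn and failed are made up for illustration.

```r
library(ggplot2)
library(dplyr)

base <- ggplot(data = cars, aes(x = speed, y = dist))

# Option 1: filter eagerly and hand the layer a data.frame.
p_df <- base + geom_point(data = filter(cars, speed > 10))

# Option 2: hand the layer a function; ggplot2 calls it with the plot data.
p_fn <- base + geom_point(data = function(d) filter(d, speed > 10))

# layer_data() returns the data a layer draws; both layers hold the same rows.
nrow(layer_data(p_df))
nrow(layer_data(p_fn))

# The bare-dot call is evaluated right away, outside any pipeline,
# so looking up `.` fails (assuming no object named `.` is defined).
failed <- tryCatch(
  base + geom_point(data = filter(., speed > 10)),
  error = function(e) e
)
inherits(failed, "error")
```

Option 1 repeats the name of the data set, while Option 2 (or the equivalent . %>% filter(speed > 10)) always follows whatever data the main ggplot call was given.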

Maurits Evers
  • But isn't `.` or even `filter(. , speed > 10)` a data.frame here ? – Ronak Shah Nov 12 '19 at 02:26
  • No, I don't think so. `. %>% filter(speed > 10)` is short for an anonymous function; that's `magrittr` syntax, see e.g. [What does the magrittr dot/period (“.”) operator do when it's at the very beginning of a pipeline?](https://stackoverflow.com/questions/53436488/what-does-the-magrittr-dot-period-operator-do-when-its-at-the-very-beginn); `ggplot2` itself doesn't know anything about `.` (as the error indicates). – Maurits Evers Nov 12 '19 at 02:31