I recently came across the %$% pipe operator, but I don't see how it differs from %>% or whether it could replace it completely.


Motivation to use %$%

  • The operator %$% could replace %>% in many cases:
mtcars %>% summary()
mtcars %$% summary(.)
mtcars %>% head(10)
mtcars %$% head(.,10)
  • Apparently, %$% is more usable than %>%:
mtcars %>% plot(.$hp, .$mpg) # Does not work
mtcars %$% plot(hp, mpg)     # Works
  • Implicitly fills the built-in data argument:
mtcars %>% lm(mpg ~ hp, data = .)
mtcars %$% lm(mpg ~ hp)
  • Since % and $ are next to each other on the keyboard, typing %$% is more convenient than typing %>%.

Documentation

We find the following information in their respective help pages.

(?magrittr::`%>%`):

Description:

     Pipe an object forward into a function or call expression.

Usage:

     lhs %>% rhs

(?magrittr::`%$%`):

Description:

     Expose the names in ‘lhs’ to the ‘rhs’ expression. This is useful
     when functions do not have a built-in data argument.

Usage:

     lhs %$% rhs

I was not able to understand the difference between the two pipe operators. What is the difference between piping an object and exposing its names? And in the rhs of %$% we can still get the piped object with the ., right?


Should I start using %$% instead of %>%? Which problems could I face doing so?

Gorka
  • You can do whatever you like, and in the examples you've shown, `%$%` is particularly powerful. But you'll find that using e.g. 'dplyr' or 'tidyr' makes `%$%` much less useful than in your particular examples, because these packages (and others like them) perform their own name lookup in the context of the LHS. – Konrad Rudolph Feb 06 '22 at 14:29
  • As for keyboard convenience, I would point out that if you are using RStudio, `%>%` can be inserted with Cmd+Shift+M or Ctrl+Shift+M, depending on your OS. – jdobres Feb 06 '22 at 14:32
  • A much easier way is to use the built-in pipe `|>`, as it's package independent. – Grzegorz Sapijaszko Feb 06 '22 at 15:21
  • Interesting, I'm pretty sure that in the past the code of `%$%` used to be `'%$%' <- with`. Now that we can use the dot, the usage you suggest is not absurd at all IMO, just unusual since most users use `%>%`. A couple of comments: you can press Ctrl+Shift+M in RStudio to save typing, and in your lm example the data arg is NOT filled implicitly. It's missing, but mpg and hp are found as independent objects in the local environment, so it's not needed. – moodymudskipper Mar 07 '22 at 16:42
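
The last comment's point about `lm()` can be checked directly. This is a minimal sketch, assuming magrittr is installed; the object names `fit_expo` and `fit_data` are illustrative only. The call recorded by the model fitted through `%$%` has no `data` entry, yet its coefficients match the fit that passes `data` explicitly.

```r
library(magrittr)

fit_expo <- mtcars %$% lm(mpg ~ hp)            # columns exposed, no data argument
fit_data <- mtcars %>% lm(mpg ~ hp, data = .)  # data passed explicitly

is.null(fit_expo$call$data)   # TRUE: data was never supplied
is.null(fit_data$call$data)   # FALSE: the call records data = .
all.equal(coef(fit_expo), coef(fit_data))
```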

2 Answers

In addition to the provided comments:

%$%, also called the exposition pipe, vs. %>%:

This is a short summary of this article https://towardsdatascience.com/3-lesser-known-pipe-operators-in-tidyverse-111d3411803a

"The key difference in using %$% or %>% lies in the type of arguments of used functions."

As far as I can tell, the one advantage of %$% over %>% is that we can avoid repeating the data frame name in functions that have no data argument.

For example, lm() has a data argument. In this case we can use %>% and %$% interchangeably.

But for functions like cor(), which have no data argument:

mtcars %>% cor(disp, mpg) # Will give an Error
cor(mtcars$disp, mtcars$mpg)

is equivalent to

mtcars %$% cor(disp, mpg)
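
A minimal sketch of this equivalence, assuming magrittr is installed. The variable names are illustrative, and base R's `with()` is included only as a point of comparison.

```r
library(magrittr)

# Three ways to compute the same correlation
r_dollar <- cor(mtcars$disp, mtcars$mpg)   # explicit column extraction
r_with   <- with(mtcars, cor(disp, mpg))   # base R: evaluate inside the data frame
r_expo   <- mtcars %$% cor(disp, mpg)      # exposition pipe

all.equal(r_dollar, r_with)
all.equal(r_dollar, r_expo)
```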

Note that to use the %$% pipe operator you have to load magrittr with library(magrittr).

Update, in response to the OP's comment: a pipe, whichever one you use, lets us turn machine-oriented code into something closer to readable human language.

ggplot2 is special. ggplot2 is not internally consistent. ggplot1 had a tidier API than ggplot2.

Pipes would work with ggplot1:

library(ggplot1)
mtcars %>%
  ggplot(list(x = mpg, y = wt)) %>%
  ggpoint() %>%
  ggsave("mtcars.pdf", width = 8, height = 6)

In 2016 Hadley Wickham said: "ggplot2 never would have existed if I'd discovered the pipe 10 years earlier!" https://www.youtube.com/watch?v=K-ss_ag2k9E&list=LL&index=9

TarJae
  • Can you explain how the *magic* of the data argument works? Why does `%$%` not work with `ggplot`, which also has a `data` argument? Is `%$%` only useful for base R functions? – Gorka Feb 07 '22 at 23:19
  • Please see my update. I think I could answer the ggplot question. – TarJae Feb 08 '22 at 23:14

No, you shouldn't use %$% routinely. It is like using the with() function, i.e. it exposes the component parts of the LHS when evaluating the RHS. But it only works when the value on the left has names, as a list or data frame does, so you can't always use it. For example,

library(magrittr)
x <- 1:10
x %>% mean()
#> [1] 5.5
x %$% mean()
#> Error in eval(substitute(expr), data, enclos = parent.frame()): numeric 'envir' arg not of length one

Created on 2022-02-06 by the reprex package (v2.0.1.9000)

You'd get a similar error with x %$% mean(.).

Even when the LHS has names, it doesn't automatically put the . argument in the first position. For example,

mtcars %>% nrow()
#> [1] 32
mtcars %$% nrow()
#> Error in nrow(): argument "x" is missing, with no default

Created on 2022-02-06 by the reprex package (v2.0.1.9000)

In this case mtcars %$% nrow(.) would work, because mtcars has names.
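
Both cases can be checked side by side. This is a small sketch, assuming magrittr is installed; `n_rows` and `res` are illustrative names, and `tryCatch()` is used only so that the failing call does not stop the script.

```r
library(magrittr)

# A data frame has names, so `.` is usable on the right-hand side
n_rows <- mtcars %$% nrow(.)

# An unnamed numeric vector cannot be exposed; capture the error instead of stopping
x <- 1:10
res <- tryCatch(x %$% mean(.), error = function(e) e)
inherits(res, "error")
```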

Your example involving .$hp and .$mpg is illustrating one of the oddities of magrittr pipes. Because the . is only used in expressions, not alone as an argument, it is passed as the first argument as well as being passed in those expressions. You can avoid this using braces, e.g.

mtcars %>% {plot(.$hp, .$mpg)}
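
To see the extra first argument without drawing a plot, here is a minimal sketch assuming magrittr is installed. `count_args()` is a hypothetical helper written for this illustration; it only reports how many arguments it received.

```r
library(magrittr)

# Hypothetical helper: reports how many arguments it received
count_args <- function(...) length(list(...))

n_plain  <- mtcars %>% count_args(.$hp, .$mpg)    # mtcars is also passed first
n_braces <- mtcars %>% {count_args(.$hp, .$mpg)}  # only the two columns
n_expo   <- mtcars %$% count_args(hp, mpg)        # only the two columns

c(plain = n_plain, braces = n_braces, exposition = n_expo)
```

The plain `%>%` call receives three arguments, which is why `plot(.$hp, .$mpg)` misbehaves there, while the braced and `%$%` forms receive two.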
user2554330
  • Good point! `%$%` cannot always be used. Now I can see why the pipe operator `%$%` is written that way, and it is actually quite helpful: `%$%` can only be used with objects on which `$` can be used. – Gorka Feb 07 '22 at 23:18
  • However, when it is possible to use `%$%`, why should it be avoided? Why is exposing the component parts bad practice? I see it more as a feature: we have first-class access to names (i.e. we avoid the `.$`), but if needed, we can also get the parent object with `.`. – Gorka Feb 07 '22 at 23:19
  • It's not bad practice when you want access to the component parts. The only negative is that it doesn't really fit the "pipe" paradigm, which is that an object is modified multiple times as it passes through the calls. – user2554330 Feb 08 '22 at 10:16
  • Late to this but re: "Since % and $ are next to each other in the keyboard, inserting %$% is more convenient" - one minor argument against this is the inclusion of a base R pipe since R 4.1, which can typically be accessed with Ctrl+Shift+M, which is thus quicker and doesn't require loading any packages. – dez93_2000 Jun 09 '23 at 20:42