
I'm going over looping with tidyverse and purrr using Hadley's R4DS book and am a little confused as to the exact usage of the tilde ~ symbol and period symbol.

So when writing for loops, or using map(), instead of writing out function(), it appears you can use the tilde symbol ~.

Does this only apply to for loops?

For example:

models <- mtcars %>% 
  split(.$cyl) %>% 
  map(~lm(mpg ~ wt, data = .))

Additionally, I was told the period can be used "to refer to the current list element", but I am confused about what that means. Does it mean that, only when looping, the period refers to the element of the list currently being looped over? How is that different from piping? When you pipe, you pass the result of one line to the next line of code.

So in the case above, mtcars is piped to the second line with split() but a period is used. Why?

The case below sums up my confusion:

x <- c(1:10)

detect(x, ~.x > 5)

Using the detect function, which finds the first match, I thought I could just use

detect(x, x >5)

but I get an error saying x > 5 is not a function. So I add a tilde

detect(x, ~ x > 5)

and get an error saying it expects a single TRUE or FALSE, not 10. So if you add a period

detect(x, ~.x >5) 

suddenly it works as a loop. So what is the relation between ~ and . here, how are they used, and how does . compare to simple piping?

Karolis Koncevičius
Kevin Lee
  • Related: https://stackoverflow.com/questions/53159979/tilde-dot-in-r – Artem Sokolov Jun 20 '20 at 16:59
  • Some related posts: [What is meaning of first tilde in purrr::map](https://stackoverflow.com/questions/44834446/what-is-meaning-of-first-tilde-in-purrrmap); [What does the dplyr period character “.” reference?](https://stackoverflow.com/questions/35272457/what-does-the-dplyr-period-character-reference); [Using the %>% pipe, and dot (.) notation](https://stackoverflow.com/questions/42385010/using-the-pipe-and-dot-notation); [dplyr piping data - difference between `.` and `.x`](https://stackoverflow.com/questions/56532119/dplyr-piping-data-difference-between-and-x) – Henrik Jun 20 '20 at 17:19

1 Answer


Overall, this is known as tidyverse non-standard evaluation (NSE). You have probably found out that ~ is also used in formulas, where it indicates that the left-hand side depends on the right-hand side.

In tidyverse NSE, ~ indicates function(...). Thus, these two expressions are equivalent.

x %>% detect(function(...) ..1 > 5)
#[1] 6

x %>% detect(~.x > 5)
#[1] 6
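To make the equivalence above concrete, here is a minimal sketch that builds the function by hand with purrr's `as_mapper()`, which turns a one-sided formula into an ordinary function. The names `is_big`, `whole`, and `first_big` are only illustrative. It also hints at why `~ x > 5` fails in `detect()`: without the dot, `x` is looked up outside the lambda, so the whole vector is compared at once.

```r
library(purrr)

# as_mapper() converts a one-sided formula into an ordinary function.
is_big <- as_mapper(~ .x > 5)

is.function(is_big)   # TRUE
is_big(3)             # FALSE
is_big(7)             # TRUE

# detect() calls the function once per element and returns the first match.
first_big <- detect(1:10, is_big)
first_big             # 6

# Without the dot, `x` is not the lambda's argument. It is found in the
# surrounding environment, so the comparison uses the whole vector.
x <- 1:10
whole <- as_mapper(~ x > 5)
length(whole(1))      # 10 logical values instead of a single TRUE/FALSE
```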

~ automatically assigns each argument of the function to special symbols: `.`; `.x` and `.y`; and `..1`, `..2`, `..3`, and so on. Note that only the first argument becomes `.`.

map2(1, 2, function(x,y) x + y)
#[[1]]
#[1] 3

map2(1, 2, ~.x + .y)
#[[1]]
#[1] 3

map2(1, 2, ~..1 + ..2)
#[[1]]
#[1] 3

map2(1, 2, ~. + ..2)
#[[1]]
#[1] 3

map2(1, 2, ~. + .[2])
#[[1]]
#[1] NA
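The `NA` in the last call can be traced by calling the lambda directly. This is a small sketch (the name `add_second` is hypothetical) showing that `.` holds only the first argument, so `.[2]` indexes into that argument and does not reach the second one.

```r
library(purrr)

# Same lambda as in the last map2() call above, built explicitly.
add_second <- as_mapper(~ . + .[2])

# `.` is the first argument, 1. Indexing 1[2] runs past the end, giving NA.
scalar_case <- add_second(1, 2)
scalar_case   # NA

# If the first argument has two elements, .[2] picks its second element.
# The second argument (99) is ignored entirely.
vector_case <- add_second(c(1, 2), 99)
vector_case   # 3 4
```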

This automatic assignment can be very helpful when there are many variables.

mtcars %>% pmap_dbl(~ ..1/..4)
# [1] 0.19090909 0.19090909 0.24516129 0.19454545 0.10685714 0.17238095 0.05836735 0.39354839 0.24000000 0.15609756
#[11] 0.14471545 0.09111111 0.09611111 0.08444444 0.05073171 0.04837209 0.06391304 0.49090909 0.58461538 0.52153846
#[21] 0.22164948 0.10333333 0.10133333 0.05428571 0.10971429 0.41363636 0.28571429 0.26902655 0.05984848 0.11257143
#[31] 0.04477612 0.19633028

In addition to all of the special symbols noted above, the arguments are also assigned to `...`. As everywhere in R, `...` behaves somewhat like a named list of arguments, so you can use it together with `with()`:

mtcars %>% pmap_dbl(~ with(list(...), mpg/hp))
# [1] 0.19090909 0.19090909 0.24516129 0.19454545 0.10685714 0.17238095 0.05836735 0.39354839 0.24000000 0.15609756
#[11] 0.14471545 0.09111111 0.09611111 0.08444444 0.05073171 0.04837209 0.06391304 0.49090909 0.58461538 0.52153846
#[21] 0.22164948 0.10333333 0.10133333 0.05428571 0.10971429 0.41363636 0.28571429 0.26902655 0.05984848 0.11257143
#[31] 0.04477612 0.19633028
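Because pmap passes each column to the function as a named argument, the same result can also be sketched with an ordinary function that names the columns it needs and absorbs the rest in `...`. This is only one possible way to write it; the variable names `ratio_named` and `ratio_positional` are illustrative.

```r
library(purrr)

# Named arguments pick out the mpg and hp columns; `...` swallows the others.
ratio_named <- pmap_dbl(mtcars, function(mpg, hp, ...) mpg / hp)

# Positional version from above: mpg is column 1 and hp is column 4 of mtcars.
ratio_positional <- pmap_dbl(mtcars, ~ ..1 / ..4)

all.equal(ratio_named, ratio_positional)   # TRUE
```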

Another way to think about why this works is that a data.frame is just a list with some row names:

a <- list(a = c(1,2), b = c("A","B"))
a
#$a
#[1] 1 2
#$b
#[1] "A" "B"
attr(a,"row.names") <- as.character(c(1,2))
class(a) <- "data.frame"
a
#  a b
#1 1 A
#2 2 B
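Since a data.frame is a list of columns, `map()` and `pmap()` walk over it differently. The sketch below uses a small made-up data frame `df` (not from the question) to illustrate the contrast: `map()` hands the lambda one whole column at a time, while `pmap()` hands it one row at a time, with `..1` being that row's value from the first column.

```r
library(purrr)

# Hypothetical two-column data frame used only for illustration.
df <- data.frame(a = c(1, 2), b = c(10, 20))

# map(): one call per column, `.x` is the entire column.
by_col <- map(df, ~ .x / 2)
by_col$a   # 0.5 1.0
by_col$b   # 5 10

# pmap(): one call per row, `..1` is that row's value of the first column.
by_row <- pmap_dbl(df, ~ ..1 / 2)
by_row     # 0.5 1.0
```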
Ian Campbell
  • So are the special symbols, .x, .y, ..1, ..2, ..3 and . just referential symbols? Based on the example, it looks like they are just referencing the values earlier in the code. In map2(1, 2, ~.x + .y), .x references the first value and .y references the second. And then in the mtcars example, it's almost like it's subsetting, because ..1 refers to the first column of mtcars. Am I understanding it right? – Kevin Lee Jun 20 '20 at 16:42
  • You've got it. Another thing to keep in mind is that `.` is being evaluated within the function environment, not within the call to `detect`. That's why `.` refers to the current value of `x` rather than the entire vector. – Ian Campbell Jun 20 '20 at 16:45
  • In the detect() example, `.` refers to the current value of x rather than the entire vector. Is that because there is a for loop built into the detect function? To follow up, in the examples above, .x and .y are "directly" referential in the map2(1, 2, ~.x + .y) example, but in the mtcars example, they are more like subsetting. How do I know when the symbol is being used to subset from the original value versus when it is used to directly refer to the value? I guess in a case where we have map(mtcars, 1, 2, ~ ), how would I reference, say, the third column in mtcars versus the third value (2)? – Kevin Lee Jun 20 '20 at 17:03
  • Sorry, to clarify, `.` refers to the value of the first argument within the function. See the evaluation of `(function(x) x > 5)(x)` and `help(detect)`. As for your question about `map(mtcars, 1, 2, ~ )`, this doesn't do what you're expecting, because `map` only accepts a single list or vector to apply over. – Ian Campbell Jun 20 '20 at 17:14
  • I see. One last question! When I evaluate map(mtcars, ~ ..1/2), the result is every column's values divided by 2, while when I use pmap(mtcars, ~ ..1/2), it is the first column divided by 2. So it looks like in the pmap case, ..1 subsets by the first column, but in the map case it does not. What is going on in the map case? – Kevin Lee Jun 20 '20 at 18:04