4

I’d like to dynamically assign which columns to subtract from each other. I’ve read around and looks like I need to use all_of, and maybe across (How to subtract one column from multiple columns in a dataframe in R using dplyr, How to you use objects in dplyr filter?). I can get it working for one variable in a mutate phrase (e.g. mutate(y = all_of(x))), but I can’t seem to do even simple calculations using two. Here’s a simplified example of what I want to do:

var1 <- c("Sepal.Length")
var2 <- c("Sepal.Width")

result <- iris %>%
  mutate(calculation = all_of(var1) - all_of(var2))
Mike
  • 921
  • 7
  • 26

2 Answers2

5

We may use .data to subset the column as a vector. The all_of/any_of are used along with across to loop across the columns

library(dplyr)
iris %>%
  mutate(calculation = .data[[var1]] - .data[[var2]])%>%
  head

-output

Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1          5.1         3.5          1.4         0.2  setosa         1.6
2          4.9         3.0          1.4         0.2  setosa         1.9
3          4.7         3.2          1.3         0.2  setosa         1.5
4          4.6         3.1          1.5         0.2  setosa         1.5
5          5.0         3.6          1.4         0.2  setosa         1.4
6          5.4         3.9          1.7         0.4  setosa         1.5

Or may also use cur_data()

iris %>%
    head %>%
    mutate(calculation = cur_data()[[var1]] - cur_data()[[var2]])

-output

 Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1          5.1         3.5          1.4         0.2  setosa         1.6
2          4.9         3.0          1.4         0.2  setosa         1.9
3          4.7         3.2          1.3         0.2  setosa         1.5
4          4.6         3.1          1.5         0.2  setosa         1.5
5          5.0         3.6          1.4         0.2  setosa         1.4
6          5.4         3.9          1.7         0.4  setosa         1.5

Or another option is to pass both the variables in across, and then reduce with -

library(purrr)
iris %>%
   head %>%
   mutate(calculation = reduce(across(all_of(c(var1, var2))), `-`))

-output

Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1          5.1         3.5          1.4         0.2  setosa         1.6
2          4.9         3.0          1.4         0.2  setosa         1.9
3          4.7         3.2          1.3         0.2  setosa         1.5
4          4.6         3.1          1.5         0.2  setosa         1.5
5          5.0         3.6          1.4         0.2  setosa         1.4
6          5.4         3.9          1.7         0.4  setosa         1.5

Or could convert to symbol and evaluate (!!)

iris %>% 
   head %>% 
   mutate(calculation = !! rlang::sym(var1) - !! rlang::sym(var2))
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1          5.1         3.5          1.4         0.2  setosa         1.6
2          4.9         3.0          1.4         0.2  setosa         1.9
3          4.7         3.2          1.3         0.2  setosa         1.5
4          4.6         3.1          1.5         0.2  setosa         1.5
5          5.0         3.6          1.4         0.2  setosa         1.4
6          5.4         3.9          1.7         0.4  setosa         1.5

Or if we want to use all_of in across, just subset the column with [[

iris %>% 
   head %>% 
   mutate(calculation = across(all_of(var1))[[1]] - 
                         across(all_of(var2))[[1]])
Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
1          5.1         3.5          1.4         0.2  setosa         1.6
2          4.9         3.0          1.4         0.2  setosa         1.9
3          4.7         3.2          1.3         0.2  setosa         1.5
4          4.6         3.1          1.5         0.2  setosa         1.5
5          5.0         3.6          1.4         0.2  setosa         1.4
6          5.4         3.9          1.7         0.4  setosa         1.5

The reason we need to subset is because, across by default will update the original column when the .names is not present. The calculation will be a data.frame with a single column

out <- iris %>%
    head %>% 
    mutate(calculation = across(all_of(var1)) -
           across(all_of(var2)))
out
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species Sepal.Length
1          5.1         3.5          1.4         0.2  setosa          1.6
2          4.9         3.0          1.4         0.2  setosa          1.9
3          4.7         3.2          1.3         0.2  setosa          1.5
4          4.6         3.1          1.5         0.2  setosa          1.5
5          5.0         3.6          1.4         0.2  setosa          1.4
6          5.4         3.9          1.7         0.4  setosa          1.5

str(out)
data.frame':    6 obs. of  6 variables:
 $ Sepal.Length: num  5.1 4.9 4.7 4.6 5 5.4
 $ Sepal.Width : num  3.5 3 3.2 3.1 3.6 3.9
 $ Petal.Length: num  1.4 1.4 1.3 1.5 1.4 1.7
 $ Petal.Width : num  0.2 0.2 0.2 0.2 0.2 0.4
 $ Species     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1
 $ calculation :'data.frame':   6 obs. of  1 variable:
  ..$ Sepal.Length: num  1.6 1.9 1.5 1.5 1.4 1.5
akrun
  • 874,273
  • 37
  • 540
  • 662
  • I don´t understand. I tried `library(rlang) iris %>% mutate(calculation = !!enquo(var1) - !!enquo(var2))` but it does not work – TarJae Sep 12 '21 at 20:29
  • 1
    @TarJae 'var1', 'var2' are string objects and `en` prefix is used wihtin a function. `enquo` is used mostly when you are trying to convert to quosure by passing a unquoted variable in a function. – akrun Sep 12 '21 at 20:31
  • 1
    The .data option works perfectly, thanks! – Mike Sep 12 '21 at 20:43
4

We could use get to access the variable values where the name of variable is stored in a string (thanks to akrun for assist):

iris %>% 
    mutate(calculation = get(var1) - get(var2)) 
   Sepal.Length Sepal.Width Petal.Length Petal.Width Species calculation
          <dbl>       <dbl>        <dbl>       <dbl> <fct>         <dbl>
 1          5.1         3.5          1.4         0.2 setosa          1.6
 2          4.9         3            1.4         0.2 setosa          1.9
 3          4.7         3.2          1.3         0.2 setosa          1.5
 4          4.6         3.1          1.5         0.2 setosa          1.5
 5          5           3.6          1.4         0.2 setosa          1.4
 6          5.4         3.9          1.7         0.4 setosa          1.5
 7          4.6         3.4          1.4         0.3 setosa          1.2
 8          5           3.4          1.5         0.2 setosa          1.6
 9          4.4         2.9          1.4         0.2 setosa          1.5
10          4.9         3.1          1.5         0.1 setosa          1.8
# ... with 140 more rows
TarJae
  • 72,363
  • 6
  • 19
  • 66