When I mutate across data, the columns selected by .cols are replaced by the results of the mutation. How can I perform this operation whilst:

  • Keeping the columns selected by .cols in the output
  • Appropriately & automatically renaming the columns created by mutate?

For example:

require(dplyr)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
require(magrittr)
#> Loading required package: magrittr
set.seed(7337)

## Create arbitrary tibble
myTibble <- tibble(x = 1:10,
                   y = runif(10),
                   z = y * pi)

## I can mutate across these columns
mutate(myTibble, across(everything(), multiply_by, 2))
#> # A tibble: 10 x 3
#>        x       y      z
#>    <dbl>   <dbl>  <dbl>
#>  1     2 1.78    5.58  
#>  2     4 0.658   2.07  
#>  3     6 0.105   0.331 
#>  4     8 1.75    5.50  
#>  5    10 1.33    4.19  
#>  6    12 1.02    3.20  
#>  7    14 1.20    3.75  
#>  8    16 0.00794 0.0250
#>  9    18 0.108   0.340 
#> 10    20 1.74    5.45

## I can subsequently rename these columns
mutate(myTibble, across(everything(), multiply_by, 2)) %>% 
  rename_with(paste0, everything(), "_double")
#> # A tibble: 10 x 3
#>    x_double y_double z_double
#>       <dbl>    <dbl>    <dbl>
#>  1        2  1.78      5.58  
#>  2        4  0.658     2.07  
#>  3        6  0.105     0.331 
#>  4        8  1.75      5.50  
#>  5       10  1.33      4.19  
#>  6       12  1.02      3.20  
#>  7       14  1.20      3.75  
#>  8       16  0.00794   0.0250
#>  9       18  0.108     0.340 
#> 10       20  1.74      5.45

## But how can I achieve this (without the fuss of creating & joining an additional table):

# # A tibble: 10 x 6
#        x      y     z x_double y_double z_double
#    <int>  <dbl> <dbl>    <dbl>    <dbl>    <dbl>
#  1     1 0.313  0.982        2    0.625    1.96 
#  2     2 0.759  2.39         4    1.52     4.77 
#  3     3 0.705  2.22         6    1.41     4.43 
#  4     4 0.573  1.80         8    1.15     3.60 
#  5     5 0.599  1.88        10    1.20     3.77 
#  6     6 0.0548 0.172       12    0.110    0.344
#  7     7 0.571  1.80        14    1.14     3.59 
#  8     8 0.621  1.95        16    1.24     3.90 
#  9     9 0.709  2.23        18    1.42     4.46 
# 10    10 0.954  3.00        20    1.91     5.99 

Created on 2021-09-16 by the reprex package (v2.0.1)

Captain Hat
  • There are examples of this laid out in the [`dplyr::across` docs](https://dplyr.tidyverse.org/reference/across.html) – camille Sep 16 '21 at 15:21
  • Does this answer your question? [Create new variables with mutate\_at while keeping the original ones](https://stackoverflow.com/questions/45947787/create-new-variables-with-mutate-at-while-keeping-the-original-ones) – camille Sep 16 '21 at 15:32
  • @camille not really - I was looking for something I could use with `across`, because `mutate_at` is superseded. I've self-answered because I couldn't find what I wanted here, but then worked it out from the docs. – Captain Hat Sep 16 '21 at 15:49
  • The accepted answer does that. It was updated after the question was asked to replace `mutate_at` – camille Sep 16 '21 at 17:11
  • @camille thank you for pointing that out. I don't think this is a dupe - the updated answer is the answer to this question, but the questions are sufficiently different that I couldn't find that answer, and likely wouldn't have clicked on it had I seen it – Captain Hat Sep 16 '21 at 19:42
  • I voted to close this down, it is a clear dupe, as camille pointed out – GuedesBF Sep 16 '21 at 22:07
  • Okay. Perhaps it is a dupe. I'm not going to close it because I'm not sure. I can see that tying the two questions together might be useful, but I feel like this question supersedes the older one, which is an _old_ question with a _new_ answer. – Captain Hat Sep 16 '21 at 22:44

1 Answer

Use the .names argument of across

across names its outputs using the argument .names, a glue specification that is passed to glue::glue(). This is a string in which "{.col}" and "{.fn}" are replaced by the names of your columns (specified by .cols) and functions (specified by .fns).

The default value for .names is NULL, which is equivalent to "{.col}" when .fns is a single function (and to "{.col}_{.fn}" when .fns is a list of functions). With a single function, every mutated column is therefore assigned the same name as its counterpart in .cols, which effectively 'overwrites' these columns in the output.
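As a minimal sketch of that difference, the snippet below compares the default naming with a custom glue spec. The tibble `df` and the result names `overwritten` and `kept` are made up for illustration, and the multiplication is written as a lambda (`~ .x * 2`) rather than with `multiply_by`, purely to keep the example dependent on dplyr alone.

```r
library(dplyr)

# Hypothetical two-column tibble, used only for this illustration
df <- tibble(a = 1:3, b = c(0.5, 1.5, 2.5))

# Default .names (NULL) with a single function: results reuse the
# input names, so `a` and `b` are overwritten in place.
overwritten <- mutate(df, across(everything(), ~ .x * 2))
names(overwritten)
#> [1] "a" "b"

# A custom glue spec gives the results new names, so the original
# columns are kept and the doubled columns are appended.
kept <- mutate(df, across(everything(), ~ .x * 2, .names = "{.col}_double"))
names(kept)
#> [1] "a"        "b"        "a_double" "b_double"
```

Any string that yields a distinct name per column works here, for example "double_{.col}" for a prefix instead of a suffix.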

To produce your desired table, set .names so that the new columns get names distinct from the originals:

require(dplyr)
#> Loading required package: dplyr
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
require(magrittr)
#> Loading required package: magrittr
set.seed(7337)

## Create arbitrary tibble
myTibble <- tibble(x = 1:10,
                   y = runif(10),
                   z = y * pi)

mutate(myTibble, across(everything(), multiply_by, 2, .names = "{.col}_double"))
#> # A tibble: 10 x 6
#>        x       y      z x_double y_double z_double
#>    <int>   <dbl>  <dbl>    <dbl>    <dbl>    <dbl>
#>  1     1 0.889   2.79          2  1.78      5.58  
#>  2     2 0.329   1.03          4  0.658     2.07  
#>  3     3 0.0527  0.165         6  0.105     0.331 
#>  4     4 0.875   2.75          8  1.75      5.50  
#>  5     5 0.666   2.09         10  1.33      4.19  
#>  6     6 0.509   1.60         12  1.02      3.20  
#>  7     7 0.598   1.88         14  1.20      3.75  
#>  8     8 0.00397 0.0125       16  0.00794   0.0250
#>  9     9 0.0541  0.170        18  0.108     0.340 
#> 10    10 0.868   2.73         20  1.74      5.45

Created on 2021-09-16 by the reprex package (v2.0.1)

Combining a named list in .fns with .names lets you create several new columns per input column in a single call:

mutate(myTibble, across(everything(),
                        .fns = list(double = multiply_by, half = divide_by),
                        2,
                        .names = "{.col}_{.fn}"))
#> # A tibble: 10 x 9
#>        x       y      z x_double x_half y_double  y_half z_double  z_half
#>    <int>   <dbl>  <dbl>    <dbl>  <dbl>    <dbl>   <dbl>    <dbl>   <dbl>
#>  1     1 0.889   2.79          2    0.5  1.78    0.444     5.58   1.40   
#>  2     2 0.329   1.03          4    1    0.658   0.165     2.07   0.517  
#>  3     3 0.0527  0.165         6    1.5  0.105   0.0263    0.331  0.0827 
#>  4     4 0.875   2.75          8    2    1.75    0.437     5.50   1.37   
#>  5     5 0.666   2.09         10    2.5  1.33    0.333     4.19   1.05   
#>  6     6 0.509   1.60         12    3    1.02    0.255     3.20   0.800  
#>  7     7 0.598   1.88         14    3.5  1.20    0.299     3.75   0.939  
#>  8     8 0.00397 0.0125       16    4    0.00794 0.00199   0.0250 0.00624
#>  9     9 0.0541  0.170        18    4.5  0.108   0.0271    0.340  0.0850 
#> 10    10 0.868   2.73         20    5    1.74    0.434     5.45   1.36
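
One caveat for newer installations: passing extra arguments through `...` of across (the bare `2` above) is deprecated as of dplyr 1.1.0. The following is a sketch of an equivalent call that moves the argument into lambdas instead; the name `result` is introduced here only so the column names can be inspected, and magrittr is no longer needed.

```r
library(dplyr)

set.seed(7337)
myTibble <- tibble(x = 1:10,
                   y = runif(10),
                   z = y * pi)

# Same idea as above, but each function is a lambda that carries its own
# argument, so nothing is passed through across()'s `...`.
result <- mutate(myTibble, across(everything(),
                                  .fns = list(double = ~ .x * 2,
                                              half   = ~ .x / 2),
                                  .names = "{.col}_{.fn}"))
names(result)
#> [1] "x"        "y"        "z"        "x_double" "x_half"   "y_double" "y_half"  
#> [8] "z_double" "z_half"
```

The names of the list elements (`double`, `half`) are what "{.fn}" expands to, so renaming a list element renames the corresponding output columns.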
Captain Hat