We can use {dplyover} to solve this kind of problem.
Disclaimer: I'm the maintainer and the package is not on CRAN.
We have two options:
The easiest way is to use dplyover::across2(). Below I use dplyr::transmute() to show only the newly created columns, but we can use mutate() instead to add the new columns to our data.frame.
across2() lets you specify two sets of columns to loop over. Here we select all columns matching starts_with("n_") and all columns matching starts_with("score_"). We can then use .x (for the former) and .y (for the latter) in the .fns argument. In the .names argument we can specify what the new names should look like: we take the name of the first column, {xcol}, and append _success to it.
library(dplyover) # https://timteafan.github.io/dplyover/
test %>%
  transmute(
    across2(starts_with("n_"),
            starts_with("score_"),
            ~ .x * .y,
            .names = "{xcol}_success")
  )
#> # A tibble: 4 × 3
#>   n_sci_success n_math_success n_hist_success
#>           <dbl>          <dbl>          <dbl>
#> 1          10               50              5
#> 2          18               30             25
#> 3          22.5             28             27
#> 4          28               24             18
While this approach is easy and straightforward, there is one problem: it assumes that the columns are in the correct order, so that the first n_ column lines up with the first score_ column, and so on. This is also an assumption of the other two answers. If we have a large data.frame and cannot be sure that all columns are in the correct order, dplyover::over() is the better and programmatically safe option.
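To make the ordering pitfall concrete, here is a minimal base R sketch of what pairing columns purely by position amounts to. It is an illustration only, not dplyover's implementation, and the `shuffled` data frame is a hypothetical example in which the score_ columns are deliberately out of order.

```r
# Hypothetical data: the score_* columns are deliberately out of order
shuffled <- data.frame(
  n_sci      = c(10, 20),
  n_math     = c(100, 50),
  score_math = c(0.5, 0.6),
  score_sci  = c(1, 0.9)
)

# Select the two column sets in the order they appear in the data
n_cols     <- grep("^n_", names(shuffled), value = TRUE)
score_cols <- grep("^score_", names(shuffled), value = TRUE)

# Pair the columns by position only: first with first, second with second
by_position <- Map(function(n, s) shuffled[[n]] * shuffled[[s]],
                   n_cols, score_cols)
names(by_position) <- paste0(n_cols, "_success")

# n_sci was multiplied by score_math here, not by score_sci
by_position$n_sci_success
```

No error or warning is raised in this sketch, which is why an order-based approach can silently go wrong on wide data.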
Here we loop over a character vector and use it to construct the variable names. Within over() we use cut_names("^.*_") to get the stems of the variable names, in our example c("sci", "math", "hist"). Then, in the function in .fns, we construct the variable names by wrapping a string inside .() (to evaluate it as a variable name). Within the string we can use {.x} to access the string of the current iteration. This approach will always combine n_sci with score_sci, even if the columns are not ordered correctly. Finally, here too we can create nice names on the fly in the .names argument, where the current string is available as {x}.
test %>%
  transmute(
    over(cut_names("^.*_"), # <- gets us c("sci", "math", "hist")
         ~ .("n_{.x}") * .("score_{.x}"),
         .names = "n_{x}_success"
    )
  )
#> # A tibble: 4 × 3
#>   n_sci_success n_math_success n_hist_success
#>           <dbl>          <dbl>          <dbl>
#> 1          10               50              5
#> 2          18               30             25
#> 3          22.5             28             27
#> 4          28               24             18
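For readers who cannot install a non-CRAN package, the same name-based pairing idea can be sketched in base R. This is only a rough illustration of the technique, not how over() works internally. The `shuffled` data frame and the helper names `stems` and `by_name` are made up for the example, and the sketch assumes every column name has the form prefix_stem.

```r
# Hypothetical data with the score_* columns deliberately out of order
shuffled <- data.frame(
  n_sci      = c(10, 20),
  n_math     = c(100, 50),
  score_math = c(0.5, 0.6),
  score_sci  = c(1, 0.9)
)

# Strip everything up to the last underscore to get the unique stems
stems <- unique(sub("^.*_", "", names(shuffled)))

# Build both column names from each stem, so pairing never depends on order
by_name <- lapply(stems, function(s) {
  shuffled[[paste0("n_", s)]] * shuffled[[paste0("score_", s)]]
})
names(by_name) <- paste0("n_", stems, "_success")

result <- as.data.frame(by_name)
result
```

Because each pair is looked up by its constructed name, n_sci is multiplied by score_sci regardless of where the columns sit in the data.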
Data from OP
library(tidyverse)
test <- tibble(id = c(1:4),
               n_sci = c(10, 20, 30, 40),
               score_sci = c(1, .9, .75, .7),
               loc_sci = c(1, 2, 3, 4),
               n_math = c(100, 50, 40, 30),
               score_math = c(.5, .6, .7, .8),
               loc_math = c(4, 3, 2, 1),
               n_hist = c(10, 50, 30, 20),
               score_hist = c(.5, .5, .9, .9),
               loc_hist = c(2, 1, 4, 3))
Created on 2023-02-12 with reprex v2.0.2