For a data frame, I would like to create a new column that is the sum of other columns using dplyr's mutate(). The columns to sum should be specified dynamically, for example as a character vector of column names. For instance, I would like to sum two specified columns from mtcars:

library(dplyr)

columns_to_sum <- c("gear", "carb")

mtcars %>%
  rowwise() %>%
  mutate(newcol = sum(columns_to_sum))

but this, as expected, results in the error:

invalid 'type' (character) of argument

How should I change the last line of code so that the sum over the requested columns is taken?

Darren Tsai
MartijnVanAttekum

1 Answer


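This is related to the SO question "Sum across multiple columns with dplyr". You can use the tidy-selection helpers all_of() or any_of() to select variables from a character vector: all_of() raises an error if any name is missing from the data, while any_of() silently ignores missing names. Note that pick() requires dplyr 1.1.0 or later.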

library(dplyr)

columns_to_sum <- c("gear", "carb")

mtcars %>%
  mutate(newcol = rowSums(pick(all_of(columns_to_sum))))
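
To illustrate the difference between the two helpers, here is a minimal sketch. The vector columns_requested and the name "not_a_column" are hypothetical, made up for this example; the rest uses only dplyr (1.1.0 or later, for pick()) and base R.

```r
library(dplyr)

# Hypothetical request that includes a column mtcars does not have
columns_requested <- c("gear", "carb", "not_a_column")

# any_of() silently drops names that are absent from the data,
# so only gear and carb are summed
result <- mtcars %>%
  mutate(newcol = rowSums(pick(any_of(columns_requested))))

head(result$newcol)

# all_of() is strict: the same request raises an error instead,
# which try() captures here so the script keeps running
strict <- try(
  mtcars %>% mutate(newcol = rowSums(pick(all_of(columns_requested)))),
  silent = TRUE
)
inherits(strict, "try-error")
```

Which helper fits depends on whether a missing column should be treated as a mistake (all_of()) or tolerated (any_of()).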

You can also use rowwise() with c_across(), but it is slower on large data because the sum is computed one row at a time rather than in a single vectorised call.

mtcars %>%
  rowwise() %>%
  mutate(newcol = sum(c_across(all_of(columns_to_sum)))) %>%
  ungroup()
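
A rough way to check the speed claim yourself is sketched below. The data frame big and the replication factor of 200 are arbitrary choices for illustration, and the timings will vary by machine, so no numbers are shown.

```r
library(dplyr)

columns_to_sum <- c("gear", "carb")

# Stack mtcars 200 times to get a moderately large data frame (6,400 rows);
# the replication factor is an arbitrary choice for this sketch
big <- bind_rows(replicate(200, mtcars, simplify = FALSE))

# Vectorised approach: one rowSums() call over the selected columns
t_vectorised <- system.time(
  vectorised <- big %>%
    mutate(newcol = rowSums(pick(all_of(columns_to_sum))))
)

# Row-wise approach: sum() is evaluated once per row
t_by_row <- system.time(
  by_row <- big %>%
    rowwise() %>%
    mutate(newcol = sum(c_across(all_of(columns_to_sum)))) %>%
    ungroup()
)

# Both approaches should produce the same sums
all.equal(unname(vectorised$newcol), by_row$newcol)

# Elapsed times in seconds; exact values depend on the machine
t_vectorised["elapsed"]
t_by_row["elapsed"]
```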
Darren Tsai