I want to write a function that sums two columns whose names share a common word; there are three such sets of columns. I want to pass that word as the only argument and add a new column with that same name. Something like this:

the_first_x <- c(0,0,10,0)
the_second_x <- c(10,0,10,0)
the_first_y <- c(0,5,5,5)
the_second_y <- c(5,5,0,0)

df <- data.frame(the_first_x,
                 the_second_x,
                 the_first_y,
                 the_second_y)

summing <- function(letter){
  df$letter <- the_first_letter + the_second_letter
}

The goal is that each of the following calls adds a column named after that letter, containing the row-wise sum:

summing(x)
summing(y)

Written like this, the letter argument is not recognised, and using something like paste() wraps the argument in quotation marks, so it is not recognised either.

2 Answers

Here is a better approach (one should avoid using <<- where possible):

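summing <- function(data, letter, pattern = paste0(letter, "$")) {
    # Select every column whose name matches `pattern` (default: names ending in `letter`)
    cols <- grepl(pattern, names(data))
    # Row-wise sum of the selected columns, stored in a new column named `letter`
    data[[letter]] <- rowSums(data[, cols, drop = FALSE], na.rm = TRUE)
    return(data)
}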

This function is especially handy when working with a pipe:

library(magrittr)
df %>% summing("x") %>% summing("y")

#  the_first_x the_second_x the_first_y the_second_y  x  y
#1           0           10           0            5 10  5
#2           0            0           5            5  0 10
#3          10           10           5            0 20  5
#4           0            0           5            0  0  5

Of course, you can use it without a pipe:

ans <- summing(df, "x")
summing(ans, "y")

The pattern argument takes a regular expression, so you can be as general or as specific as you need about which columns get summed.
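As a minimal sketch of what a custom pattern allows, the block below sums all `the_first_*` columns into a new column. The column name `first` and the pattern `"^the_first_"` are assumptions chosen for illustration, and a version of `summing` is restated so the block runs on its own.

```r
# A version of summing() from above, restated so this block is self-contained
summing <- function(data, letter, pattern = paste0(letter, "$")) {
    cols <- grepl(pattern, names(data))
    data[[letter]] <- rowSums(data[, cols, drop = FALSE], na.rm = TRUE)
    data
}

# The example data from the question
df <- data.frame(the_first_x  = c(0, 0, 10, 0),
                 the_second_x = c(10, 0, 10, 0),
                 the_first_y  = c(0, 5, 5, 5),
                 the_second_y = c(5, 5, 0, 0))

# Hypothetical use: add up the_first_x and the_first_y into a column "first"
first_total <- summing(df, "first", pattern = "^the_first_")
first_total$first  # values: 0 5 15 5
```

Here the new column name and the pattern are decoupled, which is what makes the argument flexible.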

Andre Elrico
  • Thanks, that works! It also works without the pattern: `summing <- function(data, letter){ data[[letter]] <- rowSums(data[,grepl(letter,names(data))], na.rm = T); return(data) }` – Raoul Van Oosten Sep 17 '18 at 08:44
  • Sure, it does with this example. But consider a column named `xbox`: typing `"x"` would match it as well, so it would be summed too. The regex pattern ensures that you match only the columns you intend to add. Be as specific in the pattern as possible. – Andre Elrico Sep 17 '18 at 08:52
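The `xbox` case from the comments can be checked directly. This is a small sketch on a hypothetical data frame `df_xbox`. One thing worth noting: `xbox` itself ends in `x`, so the default pattern `"x$"` still matches it, and a stricter pattern such as `"_x$"` (an assumption chosen here for illustration) is needed to exclude it.

```r
# Hypothetical data: two columns to sum plus an unrelated `xbox` column
df_xbox <- data.frame(the_first_x  = c(0, 0, 10, 0),
                      the_second_x = c(10, 0, 10, 0),
                      xbox         = c(1, 1, 1, 1))

# Column names selected by three increasingly specific patterns
match_plain    <- names(df_xbox)[grepl("x",   names(df_xbox))]  # contains "x"
match_anchored <- names(df_xbox)[grepl("x$",  names(df_xbox))]  # ends in "x"
match_strict   <- names(df_xbox)[grepl("_x$", names(df_xbox))]  # ends in "_x"

match_plain     # "the_first_x" "the_second_x" "xbox"
match_anchored  # "the_first_x" "the_second_x" "xbox"
match_strict    # "the_first_x" "the_second_x"
```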

I would advise against using (1) <<- and (2) deparse(substitute(...)). Concerning (1), here is advice from "What is the difference between assign() and <<- in R?" on what not to use <<- for (the addition in brackets is mine):

The Evil and Wrong use [of <<-] is to modify variables in the global environment.
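To illustrate the concern, here is a minimal sketch contrasting the two styles. The function names `summing_global` and `summing_pure` are hypothetical, and the data is a cut-down version of the question's example.

```r
df <- data.frame(the_first_x  = c(0, 0, 10, 0),
                 the_second_x = c(10, 0, 10, 0))

# Discouraged: modifies the `df` found outside the function as a side effect
summing_global <- function(letter) {
    df[[letter]] <<- df[[paste0("the_first_", letter)]] +
                     df[[paste0("the_second_", letter)]]
}

# Preferred: takes the data as an argument and returns a modified copy
summing_pure <- function(data, letter) {
    data[[letter]] <- data[[paste0("the_first_", letter)]] +
                      data[[paste0("the_second_", letter)]]
    data
}

result       <- summing_pure(df, "x")
names_before <- names(df)  # `df` is untouched by summing_pure()

summing_global("x")
names_after  <- names(df)  # `df` has silently gained a column "x"
```

The second style keeps the input intact, which is also what makes the function usable in a pipe.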

Here is a tidyverse option using some rlang syntax:

library(tidyverse)
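my_sum <- function(df, x) {
    # Capture the unquoted argument (e.g. x) as a quosure
    x <- enquo(x)
    # Names of all columns containing that string; the first two matches are summed
    col <- names(df)[str_detect(names(df), quo_name(x))]
    df %>% mutate(!!x := !!sym(col[1]) + !!sym(col[2]))
}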
df %>% my_sum(x)
#  the_first_x the_second_x the_first_y the_second_y  x
#1           0           10           0            5 10
#2           0            0           5            5  0
#3          10           10           5            0 20
#4           0            0           5            0  0

df %>% my_sum(y)
#  the_first_x the_second_x the_first_y the_second_y  y
#1           0           10           0            5  5
#2           0            0           5            5 10
#3          10           10           5            0  5
#4           0            0           5            0  5

We can nicely chain multiple my_sum calls:

df %>% my_sum(x) %>% my_sum(y)
#  the_first_x the_second_x the_first_y the_second_y  x  y
#1           0           10           0            5 10  5
#2           0            0           5            5  0 10
#3          10           10           5            0 20  5
#4           0            0           5            0  0  5
Maurits Evers
  • What's wrong with `deparse(substitute(...))`? Do you have a reference? – Andre Elrico Sep 14 '18 at 10:44
  • @AndreElrico There are some pitfalls that may lead to unexpected behaviour when using `deparse(substitute(...))`. At least they really confused me in the past. Details and examples can be found in [Pier Lorenzo Paracchini's *Non-Standard Evaluation in R*](https://rpubs.com/pparacch/280365), [Hadley Wickham's *Non-standard evaluation*](https://cran.r-project.org/web/packages/lazyeval/vignettes/lazyeval.html), and the SO post [Non standard evaluation in Hadley's advanced R book](https://stackoverflow.com/questions/25336780/non-standard-evaluation-in-hadleys-advanced-r-book). – Maurits Evers Sep 14 '18 at 10:56
  • You're very welcome @AndreElrico. Those examples from the linked SO post still do my head in...;-) – Maurits Evers Sep 14 '18 at 10:58
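One pitfall of this kind can be reproduced in a few lines. The sketch below uses two hypothetical helpers, `arg_name` and `wrapper`; it shows that once the call is wrapped in another function, `substitute()` sees the wrapper's parameter name rather than the expression the user typed.

```r
# Hypothetical helper: returns its argument's expression as a string
arg_name <- function(x) deparse(substitute(x))

# Hypothetical wrapper that simply forwards its argument
wrapper <- function(y) arg_name(y)

direct  <- arg_name(the_first_x)  # "the_first_x"
wrapped <- wrapper(the_first_x)   # "y", not "the_first_x"
```

Neither call evaluates `the_first_x`, so the block runs even though no such object exists.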