
I am relatively new to R and have been struggling to use dplyr inside functions. I have searched the forum and looked at similar questions, but I have not been able to resolve my problem. I have simplified it to the following example:

library(dplyr)

df <- tibble(
  g1 = c(1, 2, 3, 4, 5),
  a = sample(5),
  b = sample(5)
)

I want to write a function to calculate the sum of a and b as follows:

sum <- function(df, group_var, a, b) {
  # Capture the unquoted column arguments as quosures
  group_var <- enquo(group_var)
  a <- enquo(a)
  b <- enquo(b)

  df.temp <- df %>%
    group_by(!!group_var) %>%
    mutate(
      sum = !!a + !!b
    )

  return(df.temp)
}

I can call the function like this:

df2 <- sum(df, g1, a, b)

My issue is that I do not want to hard-code the column names in the function call, since the names "g1", "a" and "b" are likely to change. Instead, I read the column names from a config file (config.yml) and assign them to variables.

But when I pass those variables to the function, I run into multiple issues. Can someone guide me here, please? Ideally, I would like to use variables for all column name references. For example, I run into issues with this code:

A.Key <- "a"
B.Key <- "b"
df2 <- sum(df, g1, A.Key, B.Key)

Thanks in advance and sorry if it has been answered before; I could not find it.

ruser

1 Answer

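sum1 <- function(df, group_var, x, y) {

  # group_var is passed unquoted, so capture it as a quosure
  group_var <- enquo(group_var)

  # x and y hold column names as strings; convert them to symbols
  x <- as.name(x)
  y <- as.name(y)

  df.temp <- df %>%
    group_by(!!group_var) %>%
    mutate(
      sum = !!x + !!y
    )

  return(df.temp)
}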

sum1(df, g1, A.Key, B.Key)
# A tibble: 5 x 4
# Groups:   g1 [5]
     g1     a     b   sum
  <dbl> <int> <int> <int>
1    1.     3     2     5
2    2.     2     1     3
3    3.     1     3     4
4    4.     4     4     8
5    5.     5     5    10
Onyambu
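
Since the question asks for variables to be used for all column references, including the grouping column, here is a minimal sketch of an alternative that takes every column name as a string. It is not the approach from the answer above: it swaps in the `.data` pronoun and `across(all_of(...))`, and it assumes dplyr 1.0 or later. The function name `sum_by_name`, the variable `G.Key` and the fixed sample values are hypothetical and only there for illustration.

```r
library(dplyr)

# Hypothetical helper: all three column names arrive as strings,
# for example values read from a config file.
sum_by_name <- function(df, group_key, x_key, y_key) {
  df %>%
    # all_of() selects the grouping column by its name held in a string
    group_by(across(all_of(group_key))) %>%
    # .data[[...]] looks up a column of the data frame by a string name
    mutate(sum = .data[[x_key]] + .data[[y_key]])
}

# Fixed example data (assumed values, so the result is reproducible)
df_example <- tibble(
  g1 = c(1, 2, 3, 4, 5),
  a = c(3, 2, 1, 4, 5),
  b = c(2, 1, 3, 4, 5)
)

G.Key <- "g1"  # hypothetical, mirrors A.Key and B.Key from the question
A.Key <- "a"
B.Key <- "b"

result <- sum_by_name(df_example, G.Key, A.Key, B.Key)
```

With this shape, changing a column name only means changing the string stored in the corresponding variable; the function body never mentions a column directly.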