I have been searching for how to do this and found this link helpful for renaming columns passed to a function (the [,column_name] code is what finally made my_function1 work). Is there a way to use the pipe operator to rename columns in a data frame within a function?

My attempt is shown in my_function2, but it gives me either "Error: All arguments to rename must be named" or "Error: Unknown variables: col2". I am guessing this is because I have not specified what col2 belongs to.

Also, is there a way to pass paired arguments into the function, such as col1 and new_col1, so that each column name to be replaced is associated with the name replacing it? Thanks in advance!

library(dplyr)

my_df = data.frame(a = c(1,2,3), b = c(4,5,6), c = c(7,8,9))

my_function1 = function(input_df, col1, new_col1) {
  df_new = input_df
  df_new[,new_col1] = df_new[,col1]
  return(df_new)
}
temp1 = my_function1(my_df, "a", "new_a")

my_function2 = function(input_df, col2, new_col2) {
  df_new = input_df %>%
    rename(new_col2 = col2)
  return(df_new)
}

temp2 = my_function2(my_df, "b", "new_b")
Prevost
  • Have a look at `rename_` in the `?rename` help file. This seems to work: `my_function3 = function(input_df, cols, new_cols) { rename_(input_df, .dots = setNames(as.list(cols), new_cols)) }` You might find this link helpful: http://stackoverflow.com/q/30382908/1191259 – Frank Jan 26 '16 at 20:37
  • @Frank That works! I need to look into what those arguments actually mean and why that works. If you post this as an answer (and please expand with an explanation, as it would be greatly appreciated!) I will mark it as the answer. – Prevost Jan 26 '16 at 20:57
  • @Frank and I'm assuming that the same methodology works for `select_` as that is what came up in the `?rename`. I am also going to mutate the columns (which wasn't in the original question as I wanted to keep it to one question) but I think I may be missing something fundamental about how to work with passed column variables in a dataframe? Thanks. – Prevost Jan 26 '16 at 21:08
  • Generally, a data.frame is easier to work with in base R than in dplyr. I'm writing up what I know, but the syntax gets messy very quickly when programming with that package. – Frank Jan 26 '16 at 21:11

2 Answers

rename_ (along with the other dplyr verbs suffixed with an underscore) has been deprecated. Instead, try:

my_function3 = function(input_df, cols, new_cols) { 
  input_df %>%
    rename({{ new_cols }} := {{ cols }}) 
}

See the "Programming with dplyr" vignette (vignette("programming", package = "dplyr")) for more information about embracing arguments with double braces.
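For the second part of the question (pairing each old name with its replacement), one option is to build a named character vector and pass it through all_of(). This is only a minimal sketch, assuming a reasonably recent dplyr (1.0 or later) and that cols and new_cols are character vectors of equal length; my_function4 is a hypothetical name used here for illustration:

```r
library(dplyr)

my_df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# my_function4 is a hypothetical helper, not part of the original answer.
# cols and new_cols are assumed to be character vectors of the same length.
my_function4 <- function(input_df, cols, new_cols) {
  # Build a lookup of the form c(new_name = "old_name", ...)
  lookup <- setNames(cols, new_cols)
  input_df %>%
    rename(all_of(lookup))
}

temp4 <- my_function4(my_df, c("a", "b"), c("new_a", "new_b"))
names(temp4)
# "new_a" "new_b" "c"
```

Because all_of() is strict, a name in cols that is missing from input_df raises an error rather than being skipped silently.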

4redwood
  • This is currently the most appropriate way to accomplish this. There is an extra parenthesis at the end of your rename function though. – P__2 Aug 11 '21 at 17:09

Following @MatthewPlourde's answer to a similar question, we can do:

my_function3 = function(input_df, cols, new_cols) { 
  rename_(input_df, .dots = setNames(cols, new_cols)) 
}

# example
my_function3(my_df, "b", "new_b")
#   a new_b c
# 1 1     4 7
# 2 2     5 8
# 3 3     6 9

Many dplyr functions have lesser-known variants with names ending in _ that allow you to work with the package more programmatically. One pattern is...

DF %>% dplyr_fun(arg1 = val1, arg2 = val2, ...)
# becomes
DF %>% dplyr_fun_(.dots = list(arg1 = "val1", arg2 = "val2", ...))

This has worked for me in a few cases, where the val* are just column names. There are more complicated patterns and techniques, covered in the document that pops up when you type vignette("nse"), but I do not know them well.
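In current dplyr releases the underscore variants are deprecated, so the same "column names as strings" pattern is usually written with the .data pronoun (for verbs that compute on columns, such as filter and mutate) and all_of() (for verbs that select columns). A rough sketch, assuming a recent dplyr; keep_large is a hypothetical helper made up for this example:

```r
library(dplyr)

my_df <- data.frame(a = c(1, 2, 3), b = c(4, 5, 6), c = c(7, 8, 9))

# keep_large is a hypothetical helper for illustration only.
# col is a single column name given as a string.
keep_large <- function(input_df, col, cutoff) {
  input_df %>%
    filter(.data[[col]] > cutoff) %>%  # computing on a column: .data[[string]]
    select(all_of(col))                # selecting a column: all_of(string)
}

result <- keep_large(my_df, "b", 4)
result
```

The split mirrors the two kinds of dplyr arguments: .data[[col]] looks a column up by name inside an expression, while all_of(col) turns a character vector into a column selection.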

Frank
  • Unsolicited advice: try the data.table package, which has a more natural syntax for this: `setnames(DT, newnames)` or `setnames(DT,oldnames,newnames)`. There is a learning curve to it, but I find it much easier to work with. The intro materials are here: https://github.com/Rdatatable/data.table/wiki/Getting-started – Frank Jan 26 '16 at 21:16