It's important to first determine which environment in R you're programming in: dplyr or base R. If in dplyr, consult the documentation for programming with dplyr, rlang, and glue, and this stackoverflow answer. If in base R, consult the documentation on non-standard evaluation, especially wrapping the argument in as.character(substitute()) when a function expects a quoted column name, and wrapping the whole call in eval(substitute()) when a function expects an unquoted column name.
Note that both of the approaches above involve non-standard evaluation. Another approach is to use standard evaluation (or some "combination" of standard evaluation and non-standard evaluation). For example, see the issue raised in this link.
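To make the distinction concrete, here is a minimal base R sketch. The helper names col_name_nse and col_name_se and the small demo_df are hypothetical and only serve as illustration; they are not part of any package.

```r
# Non-standard evaluation: capture the unevaluated argument and turn it into a string
col_name_nse <- function(col) {
  as.character(substitute(col))
}

# Standard evaluation: the caller must already supply a string
col_name_se <- function(col) {
  stopifnot(is.character(col), length(col) == 1)
  col
}

# Small illustrative data frame (hypothetical)
demo_df <- data.frame(color = c("Blue", "Red"), stringsAsFactors = FALSE)

nse_values <- demo_df[[col_name_nse(color)]]   # bare column name
se_values  <- demo_df[[col_name_se("color")]]  # quoted column name
```

Both calls extract the same column; the only difference is whether the caller types the column name bare or as a string.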
This question arises, at least partially, from environment confusion. Here are some of the different approaches in a reprex.
Data
my_df <-
  data.frame(
    matrix(
      c(
        "V9G", "Blue",
        NA, "Red",
        "J4C", "White",
        NA, "Brown",
        "F7B", "Orange",
        "G3V", "Green"
      ),
      nrow = 6,
      ncol = 2,
      byrow = TRUE,
      dimnames = list(NULL,
                      c("color_code", "color"))
    ),
    stringsAsFactors = FALSE
  )
Packages
library(collapse)
library(dplyr)
library(stringr)
library(glue)
Functional Programming in base R (non-standard evaluation)
with a quoted column name:
my_func <- function(df, col) {
  col_char_ref <- as.character(substitute(col)) # Use as.character(substitute()) to refer to a quoted column name
  df %>%
    collapse::na_omit(cols = col_char_ref)
}
my_func(my_df, color_code)
#Should generate output below
my_df %>%
collapse::na_omit(cols = "color_code")
and with a non-quoted column name:
my_func <- function(df, col) {
  df <- df # This makes sure "df" is available inside the function environment where we evaluate the ftransform expression
  eval(substitute(collapse::ftransform(df, count = stringr::str_length(col)))) # Wrap the function to be evaluated in eval(substitute())
}
my_func(my_df, color)
#Should generate output below
my_df %>%
collapse::ftransform(count = stringr::str_length(color))
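For comparison, the same idea can be sketched without collapse or stringr by evaluating only the captured column expression inside the data frame. This is a minimal illustration, not the approach used above: count_chars and demo_df are hypothetical names, and nchar() stands in for stringr::str_length().

```r
# Hypothetical helper: evaluate an unquoted column expression inside the data frame
count_chars <- function(df, col) {
  # substitute() captures the expression; eval() looks it up in df first,
  # then falls back to the caller's environment
  values <- eval(substitute(col), envir = df, enclos = parent.frame())
  df$count <- nchar(values)
  df
}

# Small illustrative data frame (hypothetical)
demo_df <- data.frame(
  color = c("Blue", "Red", "White"),
  stringsAsFactors = FALSE
)

result <- count_chars(demo_df, color)
```

Evaluating just the column expression, rather than the whole call, keeps the non-standard part of the function small.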
Functional programming in dplyr (non-standard evaluation)
with a quoted column name using glue and dplyr functions:
my_func <- function(df, col1, col2) {
  df %>%
    mutate(description := glue("color code: {pull(., {{col1}})}; color: {pull(., {{col2}})}"))
}
my_func(my_df, color_code, color)
#Should generate output below
my_df %>%
mutate(description = glue("color code: {color_code}; color: {color}"))
or with a quoted column name using sprintf(), a wrapper for the C function of the same name:
my_func <- function(df, col1, col2) {
  df %>%
    mutate(description := sprintf("color code: %s; color: %s", {{col1}}, {{col2}}))
}
my_func(my_df, color_code, color)
#Should generate output below
my_df %>%
mutate(description = glue("color code: {color_code}; color: {color}"))
and with a non-quoted column name:
my_func <- function(df, col) {
  df %>%
    dplyr::mutate(count = stringr::str_length({{ col }}))
}
my_func(my_df, color)
#Should generate output below
my_df %>%
dplyr::mutate(count = stringr::str_length(color))
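If the column name arrives as a character string instead, dplyr's .data pronoun offers a standard-evaluation counterpart. The sketch below assumes dplyr is installed; count_chars_chr and demo_df are hypothetical names, and nchar() is used in place of stringr::str_length() to keep the dependencies small.

```r
library(dplyr)

# Hypothetical helper: the column is passed as a string and looked up via .data
count_chars_chr <- function(df, col) {
  stopifnot(is.character(col), length(col) == 1)
  df %>%
    dplyr::mutate(count = nchar(.data[[col]]))
}

# Small illustrative data frame (hypothetical)
demo_df <- data.frame(color = c("Blue", "Red"), stringsAsFactors = FALSE)

result <- count_chars_chr(demo_df, "color")
```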
Correcting error-producing code
The following code produces an error, which motivates the two examples below:
my_func <- function(df, col) {
  df <- df
  df %>%
    collapse::na_omit(cols = as.character(substitute(col))) %>%
    eval(substitute(collapse::ftransform(description = stringr::str_length(col))))
}
my_func(my_df, color_code)
#Error in ckmatch(cols, nam) : Unknown columns: col
The examples below are alternatives that do not produce errors.
Functional Programming in base R (standard evaluation - requires column to be passed as character string in function)
library(pkgcond)
my_func <- function(df, col) {
  if (!is.character(substitute(col)))
    pkgcond::pkg_error("col must be a quoted string") # in case users aren't used to quoted strings as inputs to a function
  df <- na_omit(df, cols = col)
  df$count <- stringr::str_length(.subset2(df, col))
  df
}
my_func(my_df, "color_code")
#Should generate output below
my_df %>%
  na_omit(cols = "color_code") %>%
  ftransform(count = stringr::str_length(color_code))
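The same standard-evaluation pattern can also be sketched with base R functions only, which may help when collapse, stringr, or pkgcond are not available. The names drop_na_and_count and demo_df are hypothetical. Note one deliberate difference: this version checks the value of col rather than substitute(col), so it also accepts a variable that holds the column name.

```r
# Hypothetical helper: drop rows where `col` is NA, then count characters in `col`
drop_na_and_count <- function(df, col) {
  stopifnot(is.character(col), length(col) == 1, col %in% names(df))
  df <- df[!is.na(df[[col]]), , drop = FALSE]  # keep rows with a non-missing value
  df$count <- nchar(df[[col]])                 # character count per remaining row
  df
}

# Small illustrative data frame (hypothetical)
demo_df <- data.frame(
  color_code = c("V9G", NA, "J4C"),
  color = c("Blue", "Red", "White"),
  stringsAsFactors = FALSE
)

result <- drop_na_and_count(demo_df, "color_code")
```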
Functional Programming in base R ("combination" of standard evaluation and non-standard evaluation)
my_func <- function(df, col) {
  df <- df
  df <- collapse::na_omit(df, cols = as.character(substitute(col))) # Unlike the code with the error, the function is not piped (using %>%)
  eval(substitute(collapse::ftransform(df, description = stringr::str_length(col))))
}
my_func(my_df, color_code)
#Should generate output below
my_df %>%
  na_omit(cols = "color_code") %>%
  ftransform(description = stringr::str_length(color_code))
More complex examples using the collapse package can be referenced at this link.