It's better to write functions which take character values for choosing columns. In this case, your function can be rewritten as:
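library(data.table)

mf <- function(data, y){
  # y is a column name given as a string; get(y) looks that column up inside j
  output <- data[, boxplot.stats(get(y))['out'], by = .(location)]
  # rename the generic 'out' column to the name that was passed in
  setnames(output, 'out', y)
  return(output)
}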
By using `[` to subset the output of `boxplot.stats`, a named list with one element (`'out'`) is returned. So `output` will have two columns: `location` and `out`. Then you just need to rename `out` to whatever was given for `y`.
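To make the difference between `[` and `[[` concrete, here is a minimal base-R sketch on a small made-up vector (`x`, `as_list` and `as_vector` are illustrative names, not part of the question). It only shows what each subsetting operator returns; the point is that the one-element named list is what gives the resulting column its `out` name.

```r
# Toy data: 100 lies far outside the hinges, so boxplot.stats() flags it
x <- c(1, 2, 3, 4, 100)

stats <- boxplot.stats(x)

# Single bracket keeps the list structure: a list of length 1 named "out"
as_list <- stats['out']
str(as_list)

# Double bracket drops the list and returns the bare numeric vector
as_vector <- stats[['out']]
str(as_vector)
```

Inside `j`, the bare vector from `[[` would most likely end up as an auto-named column (`V1`), so `setnames(output, 'out', y)` would have nothing called `out` to rename.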
Example:
set.seed(100)
data1 <- data.table(
  location = state.name,
  hours = rpois(1000, 12)
)
mf(data = data1, y = 'hours')
#           location hours
#  1:       Delaware    25
#  2:        Georgia    21
#  3:          Idaho     4
#  4:  Massachusetts     5
#  5:       Missouri     7
#  6: South Carolina     5
#  7: South Carolina     6
#  8:   South Dakota    20
#  9:          Texas     5
# 10:           Utah    22
Non-standard evaluation is tricky and only worth the effort if you can get something out of it. `data.table` uses it for optimization behind the scenes. `tidyverse` packages use it to allow in-database processing. If there's no benefit (besides not having to type a few quotation marks), there's only a cost.
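One way to see that cost is a small base-R sketch. The helper names (`col_name_nse`, `col_name_chr`, `wrapper_nse`, `wrapper_chr`) are hypothetical and exist only for illustration. It compares capturing an unquoted argument with `substitute()` against simply passing a string, and shows what happens once each function is called from inside another function.

```r
# NSE version: captures the unevaluated argument and turns it into a string
col_name_nse <- function(col) deparse(substitute(col))

# Character version: just takes the string as given
col_name_chr <- function(col) col

# Called directly, both give the expected column name
col_name_nse(hours)
col_name_chr("hours")

# Wrapped in another function, the NSE version sees the wrapper's
# argument name (v) instead of what the user originally typed
wrapper_nse <- function(v) col_name_nse(v)
wrapper_chr <- function(v) col_name_chr(v)

wrapper_nse(hours)    # returns "v", not "hours"
wrapper_chr("hours")  # returns "hours"
```

The character version composes without any extra work, which is the main reason to prefer it when NSE buys nothing else.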