Functions like lm() in R take a "data" argument, usually a data frame, and then let you refer to its columns by bare name. In other words, you can write x=column instead of x=df$column. How can I get the same behavior in my own user-defined functions?

A simple example:

library(tidyverse)

df <- tibble(x=1:100,y=x*(1+rnorm(n=100)))

test_corr <- function(x,y) {
  cor(x,y) %>% return()
}

# Right now I would do this
test_corr(df$x,df$y)

# I want to be able to do this
test_corr(data=df, x, y)
agdeal
  • Could you make an example of the type of function you have in mind and how you want to call it? – camille Feb 10 '20 at 02:58
  • It's easier to help you if you include a simple [reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) with sample input and desired output that can be used to test and verify possible solutions. – MrFlick Feb 10 '20 at 05:41
  • @camille thank you for your feedback, I have attached an example of what I am looking for. – agdeal Feb 10 '20 at 15:04
  • @MrFlick thank you for your feedback, I have attached an example of what I am looking for. – agdeal Feb 10 '20 at 15:04

2 Answers

Since you are using tidyverse functions, it would make sense to use tidy evaluation for this type of task. For this function you could do

test_corr <- function(data, x, y) {
  quo( cor({{x}}, {{y}}) ) %>% 
    rlang::eval_tidy(data=data)
}

test_corr(df, x, y)

First we make a quosure to build the expression you want to evaluate and we use the {{ }} (embrace) syntax to insert the variable names you pass in to the function into the expression. We then evaluate that quosure in the context of the data.frame you supply with eval_tidy.
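As a quick sanity check, here is a minimal sketch that compares the data-masked call with the explicit df$ version. It assumes rlang 0.4.0 or later (needed for the {{ }} syntax), rebuilds the sample data with a plain data.frame and an arbitrary seed, and writes test_corr() without the pipe so that only rlang has to be loaded.

```r
library(rlang)

# Same pattern as test_corr() above, written without %>%
test_corr <- function(data, x, y) {
  eval_tidy(quo(cor({{ x }}, {{ y }})), data = data)
}

# Sample data in the spirit of the question (the seed is an arbitrary choice)
set.seed(1)
df <- data.frame(x = 1:100)
df$y <- df$x * (1 + rnorm(100))

# Bare column names give the same result as explicit df$ references
all.equal(test_corr(df, x, y), cor(df$x, df$y))

# Because the arguments are captured as expressions, transformations of
# columns should also work
all.equal(test_corr(df, log(x), y), cor(log(df$x), df$y))
```

Both comparisons should return TRUE, since the quosure is evaluated with the columns of data in scope.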

You might also be interested in the tidyselect package vignette where more options are discussed.

MrFlick

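You could use reformulate(), which builds a formula from character strings. Note that with this approach the column names are passed as strings rather than bare names: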

apply_fun <- function(response, terms, data) {
   lm(reformulate(terms, response), data)
}

apply_fun("mpg", "cyl", mtcars)
#Call:
#lm(formula = reformulate(terms, response), data = data)

#Coefficients:
#(Intercept)          cyl  
#     37.885       -2.876  

apply_fun("mpg", c("cyl", "am"), mtcars)

#Call:
#lm(formula = reformulate(terms, response), data = data)

#Coefficients:
#(Intercept)          cyl           am  
#     34.522       -2.501        2.567  
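
To see what lm() actually receives here, it may help to look at the formula that reformulate() builds on its own. This is only a sketch using the built-in mtcars data; the names f1, f2 and fit are made up for illustration.

```r
# reformulate() turns character vectors into a formula object
f1 <- reformulate("cyl", response = "mpg")
f2 <- reformulate(c("cyl", "am"), response = "mpg")

deparse(f1)  # "mpg ~ cyl"
deparse(f2)  # "mpg ~ cyl + am"

# The resulting formula can be passed to lm() like a hand-written one
fit <- lm(f2, data = mtcars)
names(coef(fit))  # "(Intercept)" "cyl" "am"
```

The fitted coefficients should match the second apply_fun() call above, since the formula is the same.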
Ronak Shah