I am using a package in R that fits a specific form of a regression model. However, unlike the base lm() function that permits the x and y to be separate objects, the function that I'm using requires them to be in the same dataframe.

My problem arises because I have many predictor variables, and I want to regress y on each of them separately. I have a dataframe with 10 predictor variables (x1, x2, ..., x10) and one criterion variable (y), 11 columns in total. I could use a for loop to run ten separate regressions, but I would rather use the apply function instead. However, if I call apply on my dataframe, the last step will regress y on itself, which I want to avoid. Is there a function similar to apply that lets me specify that it should run only 10 times rather than 11, or is there another workaround?

J. Doe
  • [Fast pairwise simple linear regression between variables in a data frame](https://stackoverflow.com/q/51953709/4891738). Look for function `general_paired_simpleLM` – Zheyuan Li Sep 06 '18 at 21:01
  • Pass to apply an object without y. Nothing wrong with loops. If you pre-allocate your result object, things will go normally fast. – Roman Luštrik Sep 06 '18 at 21:08
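
The suggestion in the last comment can be sketched roughly as follows. This is only an illustration: the simulated data frame `df`, the helper vector `predictors`, and the use of `lm` as a stand-in for the unnamed model function are all assumptions, not part of the question.

```r
# Hypothetical data: 10 predictors (x1..x10) and one response (y)
set.seed(1)
df <- as.data.frame(matrix(rnorm(50 * 10), ncol = 10,
                           dimnames = list(NULL, paste0("x", 1:10))))
df$y <- rnorm(50)

# Iterate over the predictor names only, so y is never regressed on itself
predictors <- setdiff(names(df), "y")

# Option 1: lapply over the predictor names
fits_apply <- lapply(predictors, function(nm) {
  lm(as.formula(paste("y ~", nm)), data = df)
})
names(fits_apply) <- predictors

# Option 2: a for loop that fills a pre-allocated list
fits_loop <- vector("list", length(predictors))
names(fits_loop) <- predictors
for (nm in predictors) {
  fits_loop[[nm]] <- lm(as.formula(paste("y ~", nm)), data = df)
}
```

Both options run exactly ten regressions and keep the full data frame available to the model function through `data = df`.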

2 Answers

Here's a tidyverse solution:

library( tidyverse )

xx <- c("disp", "hp", "drat", "wt")   # Names of predictor variables
y <- "mpg"                            # Name of response

str_c( y, xx, sep="~" ) %>%
  map( as.formula ) %>%               # Optional (see below)
  map( lm, data = mtcars )

str_c simply builds up formulas as strings (e.g., "mpg~disp"). While lm accepts strings directly, your particular regression model might not. If it requires an actual formula, you can convert strings to formulas using as.formula (Thanks for the suggestion, @J.Doe!). Other than that, simply replace lm with your particular model and mtcars with your data frame.
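
To make the string-versus-formula distinction concrete, here is a small base R sketch. The names `strs`, `forms`, and `forms_alt` are illustrative, and `reformulate` (from the stats package) is shown only as one possible alternative to pasting strings, not as something the answer above relies on.

```r
xx <- c("disp", "hp", "drat", "wt")   # Names of predictor variables
y  <- "mpg"                           # Name of response

# Step 1: formulas as plain strings
strs <- paste(y, xx, sep = "~")
is.character(strs)                    # the elements are still just text

# Step 2: convert each string to a formula object
forms <- lapply(strs, as.formula)
class(forms[[1]])                     # "formula"

# Possible alternative: build the formulas directly, skipping the strings
forms_alt <- lapply(xx, reformulate, response = y)
```

If the model function only accepts formula objects, passing `forms` (or `forms_alt`) instead of `strs` should avoid the problem.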


Here's the same solution using base R without any additional packages:

strs <- paste( y, xx, sep="~" )
strs <- lapply( strs, as.formula )    # Optional
lapply( strs, lm, data=mtcars )
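
As a possible follow-up, the list of fitted models can be named and summarised. The sketch below assumes `lm` as the model; the names `fits` and `slopes` are hypothetical, and `coef` may need replacing with whatever accessor the actual model class provides.

```r
xx <- c("disp", "hp", "drat", "wt")
y  <- "mpg"

forms <- lapply(paste(y, xx, sep = "~"), as.formula)

# Name each fit after its predictor so results are easy to look up
fits <- setNames(lapply(forms, lm, data = mtcars), xx)

# Pull the slope (second coefficient) out of every model
slopes <- sapply(fits, function(fit) coef(fit)[[2]])
slopes
```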
Artem Sokolov
  • Just `lapply(strs, lm, data=mtcars )` works doesn't it? – thelatemail Sep 06 '18 at 22:45
  • For `lm` yes (hence the #Optional comment), but it's not 100% clear what function the OP is using. He/she says it already differs from `lm` by requiring the input vectors to be in the same data frame. My guess is that it also takes a formula, rather than a string, that is contextually specific to that data frame input. – Artem Sokolov Sep 06 '18 at 22:48
  • Thank you very much! This solution worked, however, the optional line didn't convert the string expression to an object of a class "formula", but, instead, to an object of a class "call". Therefore, I changed map(rlang::parse_expr) to map(as.formula) and then everything worked. – J. Doe Sep 07 '18 at 20:07
  • Ah, yes. You're right! `rlang::parse_expr` returns an expression, similar to `parse( text=x )`. You would need to follow it up with `eval` to evaluate the expression, which then produces a formula. Alternatively, your `as.formula` is a more elegant and compact solution! I'll update the answer. – Artem Sokolov Sep 07 '18 at 20:31
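
The difference discussed in these comments can be checked in base R. This sketch uses `parse(text = ...)` rather than rlang, on the assumption that the parsed result behaves the same way for this purpose; the names `s`, `parsed`, and `f` are illustrative.

```r
s <- "mpg~disp"

# Parsing gives an unevaluated call to `~`, not yet a formula
parsed <- parse(text = s)[[1]]
class(parsed)            # "call"

# Evaluating that call produces the formula object
f <- eval(parsed)
class(f)                 # "formula"

# as.formula reaches the same result in one step
f2 <- as.formula(s)

fit <- lm(f, data = mtcars)
```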
Using the built-in anscombe data frame, which has columns x1, x2, x3, x4, y1, y2, y3, y4, suppose we want to regress y1 on each of x1, x2, x3, x4 separately.

First create a character vector of the names of the independent variables, xnames, and then use lapply to run the function run_lm over it. That function pastes together the required formula and runs lm, returning an object of class "lm". L, the result, is a list of such objects, one for each regression.

No packages are used.

xnames <- names(anscombe)[1:4]
run_lm <- function(nm) lm(paste("y1 ~", nm), anscombe)
L <- lapply(xnames, run_lm)

Alternatively, this shorter version of run_lm would also work with the above lapply, but the Call: output line is not as nice:

run_lm <- function(nm) lm(anscombe[c("y1", nm)])
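
A quick way to convince yourself that the two versions agree is to fit both and compare coefficients. In this sketch the names `run_lm_formula`, `run_lm_subset`, `L_formula`, `L_subset`, and `same_coefs` are hypothetical labels for the two variants shown above.

```r
xnames <- names(anscombe)[1:4]

# Variant 1: paste the formula together as a string
run_lm_formula <- function(nm) lm(paste("y1 ~", nm), anscombe)

# Variant 2: pass a two-column data frame; the first column is the response
run_lm_subset <- function(nm) lm(anscombe[c("y1", nm)])

L_formula <- lapply(xnames, run_lm_formula)
L_subset  <- lapply(xnames, run_lm_subset)

# Compare the fitted coefficients pair by pair
same_coefs <- mapply(
  function(a, b) isTRUE(all.equal(unname(coef(a)), unname(coef(b)))),
  L_formula, L_subset
)

# The stored calls differ, which is what changes the Call: line when printing
L_formula[[1]]$call
L_subset[[1]]$call
```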
G. Grothendieck