
I have 677 dependent variables and 1 independent variable, as shown below. I want to regress each of the 677 columns on the independent variable. This has been asked before, but the earlier answers don't work in my case. In addition, I would like to collect the coefficients in one vector so that I can use them later in a regression against other variables.

'data.frame':   240 obs. of  678 variables:
 $ X1998.01.12  : num  -0.0006958 -0.0019206 -0.0025667 -0.0031404 -0.0000429 ...
 $ X1998.02.12  : num  0.0032112 -0.0002508 0.0010668 -0.0000417 0.0036056 ..

I run the following code:

pred = df[,c(1:677)]; 
pred=as.matrix(pred)
y=df[,c(678)]
my_lms <- lapply(de, function(x) lm(pred~y))

However, I am having an error:

Error in model.frame.default(formula = pred ~ y, drop.unused.levels = TRUE) :
invalid type (list) for variable 'pred'

Any help is appreciated!

NB: Added below after comments.

list_out <- lapply(colnames(de)[1:677], function(i)
tidy(lm(as.formula(paste(x ~ de$X678,i)), data = de)))

Error message is

Error in parse(text = x, keep.source = FALSE) : 
  <text>:2:3: unexpected symbol
1: ~ X1
2: x X1
     ^ 
Brian Tompsett - 汤莱恩
Enrico Dace
  • This should give the expected result: `my_lms <- lapply(df, function(x) lm(x~y))` – kath May 12 '18 at 22:57
  • Are you sure it wasn't just a misspelling of `df`? And since simple linear regression is statistically equivalent to a paired t-test, you should be able to find duplicates that will answer this. I would hope the duplicates you do find also carry warnings about the statistical pitfalls of such an effort. – IRTFM May 12 '18 at 23:27
  • @42 `de` is the file name. Sorry for the bugs. That part is solved now, but I am stuck on how to collect the coefficient estimates in one column so that I can use them in a later regression. Any idea? I tried everything in the comments, without success. – Enrico Dace May 13 '18 at 00:02

1 Answer


You don't need to convert `pred` to a matrix; it is easier to work with as a data.frame. Both approaches are shown below.

# data.frame
pred <- df[, c(1:677)]
y <- df[, c(678)]
my_lms <- lapply(pred, function(x) lm(x ~ y))

# matrix
pred <- as.matrix(pred)
my_lms <- lapply(1 : ncol(pred), function(x) lm(pred[, x] ~ y))
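
As an aside, `lm()` also accepts a numeric matrix on the left-hand side and then fits one regression per column in a single call. The sketch below uses simulated data (the small `df`, with 5 response columns instead of 677, and the names `fit_all`, `coefs` and `slopes` are made up for illustration), so treat it as a starting point rather than a drop-in solution.

```r
# Simulated stand-in for the real data: 5 response columns plus one regressor
set.seed(1)
df <- as.data.frame(matrix(rnorm(240 * 6), nrow = 240))
names(df) <- c(paste0("X", 1:5), "y")

pred <- as.matrix(df[, 1:5])   # the response must be a numeric matrix here
y <- df$y

fit_all <- lm(pred ~ y)        # a single "mlm" object holding all 5 fits
coefs <- coef(fit_all)         # 2 x 5 matrix: rows "(Intercept)" and "y"
slopes <- coefs["y", ]         # named vector with one slope per column
```

Each column of `coefs` should match what a separate `lm()` call on that column returns.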

Also, check if you have the correct dependent and independent variable.

Edit for tidying

library(broom)
my_lms <- lapply(1 : ncol(pred), function(x) tidy(lm(pred[, x] ~ y)))
my_df <- do.call(rbind, my_lms)
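
If only the estimates and the adjusted R-squared are needed in matrix form, something along these lines should work in base R, without `broom`. The data are simulated and the names `results` and `slopes` are illustrative assumptions, not part of the original question.

```r
# Simulated stand-in for the real data (5 response columns instead of 677)
set.seed(1)
df <- as.data.frame(matrix(rnorm(240 * 6), nrow = 240))
names(df) <- c(paste0("X", 1:5), "y")
pred <- df[, 1:5]
y <- df$y

my_lms <- lapply(pred, function(x) lm(x ~ y))

# One row per regression: intercept, slope and adjusted R-squared
results <- do.call(rbind, lapply(my_lms, function(fit) {
  c(intercept = unname(coef(fit)[1]),
    slope     = unname(coef(fit)[2]),
    adj_r2    = summary(fit)$adj.r.squared)
}))

slopes <- results[, "slope"]   # plain named numeric vector for later use
```

`results` has one row per column of `pred`, and `slopes` is an ordinary numeric vector, so it can be passed to a later `lm()` call like any other variable.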
kangaroo_cliff
  • I'm guessing that the last sentence was prompted by the rather odd request, which suggests a reversal of the usual notion of independent variables being predictors (on the RHS of a formula) and dependent variables being outcomes (on the LHS of a formula). – IRTFM May 12 '18 at 23:32
  • Yes. Now I see the OP mentioned this at the beginning. Maybe I should remove the last line. – kangaroo_cliff May 12 '18 at 23:35
  • Great, it works! How about collecting the estimates, coefficients, and adjusted R-squared? I want to stack them in matrix format. I tried `sapply(my_lms, coef)`, but it gives many lists; I want something simpler. @Su – Enrico Dace May 12 '18 at 23:37
  • @balsano see if [this](https://stackoverflow.com/questions/49932772/running-several-simple-regression-in-r/49933444?noredirect=1#comment86997180_49933444) helps. – kangaroo_cliff May 12 '18 at 23:38
  • @Suren It doesn't work in my case. Running `library(broom)` and then `tidy(my_lms)` gives `Error in tidy.list(my_lms) : No tidying method recognized for this list` – Enrico Dace May 13 '18 at 00:00
  • @balsano see where `tidy` is called in that answer... it was inside the `lapply`. – kangaroo_cliff May 13 '18 at 00:04
  • @Suren I have added the error message to the question post. The error is `Error in parse(text = x, keep.source = FALSE) : <text>:2:3: unexpected symbol 1: ~ X1 2: x X1 ^` – Enrico Dace May 13 '18 at 00:15
  • @balsano it should be something like `my_lms <- lapply(1 : ncol(pred), function(x) tidy(lm(pred[, x] ~ y)))`. See the edited answer. – kangaroo_cliff May 13 '18 at 00:19
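
On the `parse` error from the formula-based attempt: `paste(x ~ de$X678, i)` passes an unquoted formula to `paste()`, which appears to split it into its parts ("~", "x", "de$X678") before pasting, so `as.formula()` ends up with text that is not valid R. A minimal sketch of building the formula from strings instead, assuming the regressor column is named `X678` (taken from the question's `de$X678`) and using a small simulated `de`:

```r
# Simulated stand-in for `de`: 3 response columns plus the regressor X678
set.seed(1)
de <- as.data.frame(matrix(rnorm(240 * 4), nrow = 240))
names(de) <- c("X1", "X2", "X3", "X678")

responses <- setdiff(names(de), "X678")

list_out <- lapply(responses, function(i) {
  f <- as.formula(paste(i, "~ X678"))   # builds e.g. X1 ~ X678 from text
  coef(lm(f, data = de))
})
names(list_out) <- responses
```

`reformulate("X678", response = i)` from base R builds the same formula without pasting strings by hand.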