I have a 6x1 vector and a 6x1000 matrix. I want to run a linear regression of each column of the matrix on that same vector, looping through the matrix and extracting the R^2 value from each regression. Can anyone help with this? Thanks!
-
Asim, can you supply a reproducible, cut-down version of the problem? See https://stackoverflow.com/questions/5963269. If you can give us some sample data (say, a 6x10 matrix) and show what the answer should look like, it would help us help you. – David T May 18 '20 at 02:54
-
Does the Usage example at https://purrr.tidyverse.org/ help? – David T May 18 '20 at 03:14
-
For this, just `sapply` and `cor` should work. – yarnabrina May 18 '20 at 03:30
1 Answer
You need to provide data, e.g.:
set.seed(42)
x <- matrix(rnorm(6), 6, 1)
y <- matrix(rnorm(6*10), 6, 10) # Just 10 columns to demonstrate
The first solution fits a linear regression with `x` as the independent variable and each column of `y` as the dependent variable, then extracts the R^2 value. The second solution simply computes the squared correlation coefficient, which equals R^2 in a regression with a single predictor:
corrs <- sapply(1:10, function(i) summary(lm(y[, i]~x))$r.squared)
# [1] 0.039143014 0.003056088 0.897015721 0.282917356 0.019288198 0.001808288 0.055232746 0.276741234 0.008821625 0.073663713
sapply(1:10, function(i) cor(x, y[, i])^2)
# [1] 0.039143014 0.003056088 0.897015721 0.282917356 0.019288198 0.001808288 0.055232746 0.276741234 0.008821625 0.073663713
names(corrs) <- 1:10 # Label the columns
corrs[which(corrs > .25)]
# 3 4 8
# 0.8970157 0.2829174 0.2767412
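For the full 6x1000 case, the loop can probably be avoided altogether, since `cor` accepts matrices and correlates every column of its first argument with every column of its second. The following is only a sketch under assumptions: the data are simulated, and the column names (`col1`, `col2`, ...) and the `threshold` value are made up for illustration.

```r
set.seed(42)
x <- matrix(rnorm(6), 6, 1)                      # the common 6 x 1 predictor
y <- matrix(rnorm(6 * 1000), 6, 1000)            # simulated stand-in for the 6 x 1000 data
colnames(y) <- paste0("col", seq_len(ncol(y)))   # hypothetical column names

# cor() on two matrices returns a 1 x 1000 matrix here;
# drop() flattens it to a vector that keeps the column names of y.
r2 <- drop(cor(x, y))^2

# Values above a chosen cutoff, together with their column names
threshold <- 0.25                                # arbitrary cutoff for illustration
r2[r2 > threshold]
```

Because `r2` is a named vector, logical subsetting returns both the values and the column names, so `which` is not strictly needed.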

dcarlson
-
This was a great help. If I want to pull out the values greater than X along with the column names associated with them, how can I do that? – asim May 18 '20 at 05:23
-
I've added that to the answer. Just label the elements of the vector of correlations (`corrs`) and then use `which` to get the values and the column numbers/names. – dcarlson May 18 '20 at 20:25