I am having trouble building an lm call from many independent variables that are created in a for loop. Each pass of the loop creates one independent variable (x1, x2, x3, ..., x14; 14 in total), and the variable names are saved as strings in the vector 'independent_variables'. For the dependent variable y1, I would like to build the model lm(y1 ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 + x10 + x11 + x12 + x13 + x14).
I tried pasting the elements of the vector together and passing the result to lm, but lm does not seem to recognize it as a formula.
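independent_variables <- character(0)  # start empty so c() can append to it
for (j in seq_along(num)) {
  nam <- paste0("x", j)                # "x1", "x2", ...
  assign(nam, vec)                     # vec is the matrix built in this iteration (construction omitted)
  independent_variables <- c(independent_variables, nam)
}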
> independent_variables
[1] "x1" "x2" "x3" "x4" "x5" "x6" "x7" "x8" "x9" "x10" "x11" "x12" "x13" "x14"
Each name refers to one independent variable of the linear regression, and each variable is a matrix with 318 rows and 1 column. The dependent variable y1 is a matrix with the same dimensions.
> x1
                                                        COAD_65
ACCx_025FE5F8_885E_433D_9018_7AE322A92285_X034_S09 -0.368827920
ACCx_2A5AE757_20D5_49B6_95FF_CAE08E8197A0_X012_S05 -0.418133754
ACCx_3D0CD3BD_3960_46FB_92C3_777F11CCD0FC_X011_S06 -0.885246719
ACCx_4D0D43F5_D8F0_4735_92D5_F40E321C7A05_X010_S09 -0.908954868
ACCx_81A262BD_3078_4BDB_8EB1_30DD6D7948C3_X027_S03 -0.284544506
ACCx_B6E6F014_A599_4A58_A7A5_1F748471D662_X013_S12 -0.991800815
ACCx_B901534B_5E93_475A_91E7_B2DB7DFE98A5_X020_S02 -0.538162178
ACCx_EDEB779F_A603_479D_AAFE_428BC7E4B8DB_X038_S03 -0.462774125
...
UCEC_BDFE8123_081E_49AF_930B_2371D8DEC261_X030_S01 -1.032249118
UCEC_C335297F_2D63_4973_9182_FA18C28E001E_X037_S04 -0.550676273
UCEC_D820B024_6B3B_4B5B_866E_F9A8139C270B_X039_S09 -0.036913872
> y1
TCGA-OR-A5K8-01A TCGA-PK-A5H8-01A TCGA-OR-A5J3-01A TCGA-OR-A5J6-01A TCGA-OR-A5KX-01A TCGA-OR-A5J2-01A
     0.000000000      0.000000000      0.013752216      0.000000000      0.000000000      0.000000000
TCGA-OR-A5J9-01A TCGA-OR-A5JZ-01A TCGA-PA-A5YG-01A TCGA-CU-A3YL-01A TCGA-GD-A3OQ-01A TCGA-CF-A3MI-01A
     0.009707204      0.000000000      0.000000000      0.000000000      0.000000000      0.119174367
...
TCGA-BL-A13J-01A TCGA-GV-A3JW-01A TCGA-DK-A1AD-01A TCGA-FD-A3SR-01A TCGA-CF-A1HR-01A TCGA-BL-A3JM-01A
     0.019066953      0.355925504      0.019473742      0.062201816      0.081559894      0.243386421
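For illustration, here is a minimal sketch of one alternative layout for data shaped like this: instead of 14 separate objects created with assign(), the one-column matrices could be kept as columns of a single data frame, so that the formula y1 ~ . uses every other column as a predictor. The random data and the names mats and dat are hypothetical stand-ins, not my real objects.

```r
# Hypothetical stand-in data: 14 one-column matrices with 318 rows each, plus y1.
set.seed(1)
n <- 318
mats <- lapply(1:14, function(j) matrix(rnorm(n), ncol = 1))
names(mats) <- paste0("x", 1:14)
y1 <- matrix(rnorm(n), ncol = 1)

# Flatten each n x 1 matrix to a plain vector and collect them as columns.
dat <- as.data.frame(lapply(mats, as.vector))
dat$y1 <- as.vector(y1)

# "." on the right-hand side means "all columns of dat other than y1".
fit <- lm(y1 ~ ., data = dat)
print(summary(fit))
```

With this layout no formula string has to be assembled at all, at the cost of changing how the loop stores its results.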
Once the lm call is built correctly, the summary output should look something like this, for example:
> Call:
> lm(formula = y1 ~ x1 + x2 + x3 + x4 + x5 + x6 + x7 + x8 + x9 +
>     x10 + x11 + x12 + x13 + x14)
>
> Residuals:
>     Min      1Q  Median      3Q     Max
> -0.6282 -0.1130 -0.0257  0.0491  6.0798
>
> Coefficients:
>              Estimate Std. Error t value Pr(>|t|)
> (Intercept)  0.054546   0.040219   1.356   0.1759
> x1           0.145644   0.035340   4.121 4.66e-05 ***
> x2           0.005909   0.038020   0.155   0.8766
> x3          -0.085892   0.051854  -1.656   0.0985 .
> x4           0.032686   0.029443   1.110   0.2677
> x5          -0.047268   0.033388  -1.416   0.1577
> x6           0.026735   0.032327   0.827   0.4088
> x7           0.035673   0.051047   0.699   0.4851
> x8           0.037374   0.060258   0.620   0.5355
> x9           0.024493   0.053045   0.462   0.6445
> x10          0.006623   0.059025   0.112   0.9107
> x11         -0.017017   0.034501  -0.493   0.6221
> x12          0.032184   0.046235   0.696   0.4868
> x13          0.009988   0.033298   0.300   0.7644
> x14         -0.017836   0.024505  -0.728   0.4672
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>
> Residual standard error: 0.3936 on 366 degrees of freedom
> Multiple R-squared:  0.2768, Adjusted R-squared:  0.2492
> F-statistic: 10.01 on 14 and 366 DF,  p-value: < 2.2e-16
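A minimal sketch of how a call with this shape might be produced from the vector of names. A pasted string is only a character value, so it has to be converted into a formula object first, either with as.formula() or with reformulate(). The data below are random stand-ins (plain numeric vectors rather than my one-column matrices), and f1, f2, rhs and fit are names chosen just for this example.

```r
# Hypothetical stand-in data: x1..x14 and y1 as numeric vectors of length 318.
set.seed(1)
n <- 318
independent_variables <- paste0("x", 1:14)
for (nam in independent_variables) assign(nam, rnorm(n))
y1 <- rnorm(n)

# Option 1: paste the names into one string, then convert it to a formula.
rhs <- paste(independent_variables, collapse = " + ")
f1 <- as.formula(paste("y1 ~", rhs))

# Option 2: reformulate() builds the formula directly from the names.
f2 <- reformulate(independent_variables, response = "y1")

# lm(f2) would record the call as lm(formula = f2); going through do.call()
# stores the expanded formula in the call instead.
fit <- do.call("lm", list(formula = f2))
print(summary(fit))
```

Both options should give an equivalent formula. The do.call() step only affects how the Call line is displayed, not the fitted coefficients.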