I have a data.frame that contains a date column, a type column, a y column, and 3 explanatory variables (x1, x2, x3). For each month, it holds one observation of y and of the 3 explanatory variables per type.
# A tibble: 6 x 6
  date       type           y    x1    x2    x3
  <date>     <chr>      <dbl> <dbl> <dbl> <dbl>
1 1926-07-01 Small; Low  3.78 1.28   1.44 0.431
2 1926-07-01 Small; 2   -0.41 1.07   1.52 0.238
3 1926-07-01 Small; 3   -1.94 1.05   1.25 0.521
4 1926-07-01 Small; 4    0.35 0.944  1.20 0.589
5 1926-08-01 Small; Low -2.21 1.28   1.44 0.431
6 1926-08-01 Small; 2   -8.73 1.07   1.52 0.238
I would like to run t regressions, where t is the number of months. Each regression would be y ~ x1 + x2 + x3, fitted on all the y's and explanatory variables for that month.
sample %>% group_by(date) %>% map(~lm(#not sure))
I have been trying to use dplyr's group_by(date) but am not sure how to access each column. Lastly, the number of explanatory variables may change, so I do not want to reference them by name but rather as all columns other than date, type, and y.
structure(list(date = structure(c(-15890, -15890, -15890, -15890,
-15859, -15859, -15859, -15859, -15828, -15828, -15828, -15828,
-15798, -15798, -15798, -15798), class = "Date"), type = c("Small; Low",
"Small; 2", "Small; 3", "Small; 4", "Small; Low", "Small; 2",
"Small; 3", "Small; 4", "Small; Low", "Small; 2", "Small; 3",
"Small; 4", "Small; Low", "Small; 2", "Small; 3", "Small; 4"),
y = c(3.78, -0.41, -1.94, 0.35, -2.21, -8.73, 2.44, 0.61,
-6.21, -0.3, -6.2, -1.64, -8.62, -3.75, -5.67, 5.72), x1 = c(1.28446741361197,
1.07356662903464, 1.04832788500252, 0.943639719875559, 1.28446741361197,
1.07356662903464, 1.04832788500252, 0.943639719875559, 1.28446741361197,
1.07356662903464, 1.04832788500252, 0.943639719875559, 1.28446741361197,
1.07356662903464, 1.04832788500252, 0.943639719875559), x2 = c(1.44263144125512,
1.52425318619375, 1.24887757539023, 1.20367883758503, 1.44263144125512,
1.52425318619375, 1.24887757539023, 1.20367883758503, 1.44263144125512,
1.52425318619375, 1.24887757539023, 1.20367883758503, 1.44263144125512,
1.52425318619375, 1.24887757539023, 1.20367883758503), x3 = c(0.430576566887732,
0.237649845254604, 0.520660917641051, 0.588602620999144,
0.430576566887732, 0.237649845254604, 0.520660917641051,
0.588602620999144, 0.430576566887732, 0.237649845254604,
0.520660917641051, 0.588602620999144, 0.430576566887732,
0.237649845254604, 0.520660917641051, 0.588602620999144)), row.names = c(NA,
-16L), class = c("tbl_df", "tbl", "data.frame"))
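To make the goal concrete, here is a minimal base-R sketch of one way the per-month regressions might be set up. It is an illustration under assumptions, not a confirmed solution: `toy` is a made-up data frame with the same column layout as above (random values, six rows per month so each fit has spare degrees of freedom), and the names `predictors`, `fml`, `fits`, and `coefs` are hypothetical. The formula is built with `reformulate()` from every column other than date, type, and y, so no explanatory variable is referenced by name.

```r
# Toy data with the same layout as the question (values are made up)
set.seed(42)
months <- as.Date(c("1926-07-01", "1926-08-01", "1926-09-01"))
toy <- data.frame(
  date = rep(months, each = 6),
  type = rep(paste("type", 1:6), times = 3),
  y    = rnorm(18),
  x1   = rnorm(18),
  x2   = rnorm(18),
  x3   = rnorm(18)
)

# Treat every column other than date, type and y as explanatory
predictors <- setdiff(names(toy), c("date", "type", "y"))

# Build the formula y ~ x1 + x2 + x3 without typing the names
fml <- reformulate(predictors, response = "y")

# One lm() per month: split the rows by date, then fit on each piece
fits <- lapply(split(toy, toy$date), function(d) lm(fml, data = d))

# Collect the coefficients into a matrix with one row per month
coefs <- t(sapply(fits, coef))
```

If a tidyverse pipeline is preferred, the same `fml` could presumably be passed to `lm()` inside `purrr::map()` in place of `lapply()`. Note that with the data posted above there are only four rows per month and four coefficients (intercept plus three slopes), so each monthly fit would have no residual degrees of freedom.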