I have a data.frame with a date column, a type column, a y column, and three explanatory variables (x1, x2, x3). For every month, it holds one observation of y and of the three explanatory variables for each type.

# A tibble: 6 x 6
  date       type           y    x1    x2    x3
  <date>     <chr>      <dbl> <dbl> <dbl> <dbl>
1 1926-07-01 Small; Low  3.78 1.28   1.44 0.431
2 1926-07-01 Small; 2   -0.41 1.07   1.52 0.238
3 1926-07-01 Small; 3   -1.94 1.05   1.25 0.521
4 1926-07-01 Small; 4    0.35 0.944  1.20 0.589
5 1926-08-01 Small; Low -2.21 1.28   1.44 0.431
6 1926-08-01 Small; 2   -8.73 1.07   1.52 0.238

I would like to run t regressions, where t is the number of months. Each regression would be y ~ x1 + x2 + x3, fitted only on the rows belonging to that month.

sample %>% group_by(date) %>% map(~ lm(...))  # not sure what to pass to lm()

I have been trying to use dplyr's group_by(date) but am not sure how to access each column within a group. Also, the number of explanatory variables may change, so I do not want to reference them by name but rather as all columns other than date, type, and y.

structure(list(date = structure(c(-15890, -15890, -15890, -15890, 
-15859, -15859, -15859, -15859, -15828, -15828, -15828, -15828, 
-15798, -15798, -15798, -15798), class = "Date"), type = c("Small; Low", 
"Small; 2", "Small; 3", "Small; 4", "Small; Low", "Small; 2", 
"Small; 3", "Small; 4", "Small; Low", "Small; 2", "Small; 3", 
"Small; 4", "Small; Low", "Small; 2", "Small; 3", "Small; 4"), 
    y = c(3.78, -0.41, -1.94, 0.35, -2.21, -8.73, 2.44, 0.61, 
    -6.21, -0.3, -6.2, -1.64, -8.62, -3.75, -5.67, 5.72), x1 = c(1.28446741361197, 
    1.07356662903464, 1.04832788500252, 0.943639719875559, 1.28446741361197, 
    1.07356662903464, 1.04832788500252, 0.943639719875559, 1.28446741361197, 
    1.07356662903464, 1.04832788500252, 0.943639719875559, 1.28446741361197, 
    1.07356662903464, 1.04832788500252, 0.943639719875559), x2 = c(1.44263144125512, 
    1.52425318619375, 1.24887757539023, 1.20367883758503, 1.44263144125512, 
    1.52425318619375, 1.24887757539023, 1.20367883758503, 1.44263144125512, 
    1.52425318619375, 1.24887757539023, 1.20367883758503, 1.44263144125512, 
    1.52425318619375, 1.24887757539023, 1.20367883758503), x3 = c(0.430576566887732, 
    0.237649845254604, 0.520660917641051, 0.588602620999144, 
    0.430576566887732, 0.237649845254604, 0.520660917641051, 
    0.588602620999144, 0.430576566887732, 0.237649845254604, 
    0.520660917641051, 0.588602620999144, 0.430576566887732, 
    0.237649845254604, 0.520660917641051, 0.588602620999144)), row.names = c(NA, 
-16L), class = c("tbl_df", "tbl", "data.frame"))
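
For reference, one possible way to set this up is sketched below. It is not a confirmed answer, only a minimal illustration under stated assumptions: the object names (sample_df, predictors, fml, fits, coefs) are made up for the example, the data is a rounded two-month stand-in for the structure above, and split() plus purrr::map() is used in place of group_by(). The predictors are taken as every column other than date, type, and y, and the formula is built with base R's reformulate().

```r
library(purrr)  # provides map() and re-exports the %>% pipe

# Assumption: a small stand-in for the data above (2 months x 4 types, values rounded).
sample_df <- data.frame(
  date = as.Date(rep(c("1926-07-01", "1926-08-01"), each = 4)),
  type = rep(c("Small; Low", "Small; 2", "Small; 3", "Small; 4"), times = 2),
  y    = c(3.78, -0.41, -1.94, 0.35, -2.21, -8.73, 2.44, 0.61),
  x1   = rep(c(1.284, 1.074, 1.048, 0.944), times = 2),
  x2   = rep(c(1.443, 1.524, 1.249, 1.204), times = 2),
  x3   = rep(c(0.431, 0.238, 0.521, 0.589), times = 2),
  stringsAsFactors = FALSE
)

# Explanatory variables = all columns except date, type and y,
# so nothing has to be changed when x4, x5, ... are added later.
predictors <- setdiff(names(sample_df), c("date", "type", "y"))

# Build y ~ x1 + x2 + x3 from the column names.
fml <- reformulate(predictors, response = "y")

# One data frame per month, then one lm() fit per data frame.
fits <- sample_df %>%
  split(.$date) %>%
  map(~ lm(fml, data = .x))

# Stack the coefficient vectors: one row per month, one column per term.
coefs <- do.call(rbind, lapply(fits, coef))
print(coefs)
```

With only four types per month and four coefficients (intercept plus three slopes), each monthly fit in this example has zero residual degrees of freedom, so it yields coefficients but no meaningful standard errors; more types per month would be needed for inference.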
cpage
  • How would you select the explanatory variables so that you pass all columns except date, type, and y? (That solution refers to the columns by name.) – cpage Oct 02 '18 at 02:51
  • Got it. Mark as duplicate. – cpage Oct 02 '18 at 02:57

0 Answers