3

To ease reproducibility, I am using the goats data set from the ResourceSelection package which has spatial data for used (STATUS == 1) and 'available' (STATUS == 0) GPS locations of mountain goats. ID is for individual (n = 10) and ELEVATION, ... , TASP are attributes of the points.

library(ResourceSelection)
head(goats)
  STATUS ID ELEVATION   SLOPE       ET   ASPECT       HLI      TASP
1      1  1       651 38.5216  35.3553 243.1131 0.9175926 0.9468804
2      1  1       660 39.6927  70.7107 270.0000 0.8840338 0.6986293
3      1  1       316 20.5477  50.0000 279.2110 0.7131423 0.5749115
4      1  1       334 34.0783  35.3553 266.1859 0.8643775 0.7447368
5      1  1       454 41.6187  25.0000 258.3106 0.9349181 0.8292587
6      1  1       343 28.4694 103.0776 237.0426 0.8254866 0.9756112

I would like to fit a glm to each individual grouped by Season (created below) but have the model structure vary according to season. I have been using a number of helpful SO posts and other resources but they all fit a single model to each group where as I would like to fit different models for each Season grouping.

#Add a new `Season` field 
library(tidyverse)

goats <- goats %>% 
  mutate(Season = if_else(ID %in% 1:3, "Summer",
                          if_else(ID %in% 4:7, "Winter", "Fall")))

Below I create a model building function and specify the model specific to each season using if else.

SeasonalMods <- function(df) {
  #Models for Archery
  if(Season == "Summer") {
    glm(STATUS ~ SLOPE + I(SLOPE^2), data = df)
  #Models for Winter  
  } else if (Season == "Winter") {
    glm(STATUS ~ SLOPE + ASPECT + TASP, data = df)
  #Models for Fall
   } else if (Season == "Fall") {
    glm(STATUS ~ ELEVATION + SLOPE + ET + ASPECT + HLI + TASP, data = df)}
  }

I then try to map the function to the grouped data and create new list-columns as follows.

ModelFits <- goats %>%
  group_by(Season, ID) %>% 
  nest() %>% 
  mutate(fits = map(data, SeasonalMods),
         tidied = map(fits, tidy),
         glanced = map(fits, glance),
         augmented = map(fits, augment))

Which generates the following error:

Error in mutate_impl(.data, dots) : 
  Evaluation error: object 'Season' not found

I am not sure how to correctly specify Season in the SeasonalMods function so that it is interpretable by map().

I tried to add df$ in front of Season within the if and else if statements, but this also generates an error.

B. Davis
  • 3,391
  • 5
  • 42
  • 78
  • Have you seen the `modelr` package? Not sure if it would help https://modelr.tidyverse.org/reference/fit_with.html – camille Mar 13 '19 at 21:48
  • Looking more closely at your function, the `if`s inside your function don't know that `Season` is a column in `df`. You could take `Season` as an argument. Keep in mind also that `if` statements only look at the first element in a list or vector, unlike the vectorized `ifelse` – camille Mar 13 '19 at 21:54
  • @camille `modelr` looks promising. I will look into that as well. As for the 2nd comment, yes, you are correct that I have not correctly identified `Season` but am not sure how to do so. How would I "take `Season` as an argument" as you suggest...? – B. Davis Mar 13 '19 at 21:59

1 Answers1

3

After I'd suggested using modelr, I went ahead without it. Like I said above, your function SeasonalMods doesn't know that Season is a column in the data frame that it takes as an argument, so you'll get errors that it's undefined. One way to do this instead is to add a second argument to the function to take the season. Since you're nesting your data, it's easy now to pass the data and season to your modeling function. I'm using map2 because the columns data and Season are the same length.

library(tidyverse)
library(broom)

All the internals of this function are the same—I've only added the second argument.

SeasonalMods <- function(df, Season) {
  ...
}

Just to illustrate the broom functions you used, I've added the tidied column, and save this data frame out:

models <- goats %>%
  group_by(Season, ID) %>%
  nest() %>%
  mutate(fits = map2(data, Season, ~SeasonalMods(.x, .y))) %>%
  mutate(tidied = map(fits, tidy))

head(models)
#> # A tibble: 6 x 5
#>   Season    ID data                 fits      tidied          
#>   <chr>  <int> <list>               <list>    <list>          
#> 1 Summer     1 <tibble [2,106 × 7]> <S3: glm> <tibble [3 × 5]>
#> 2 Summer     2 <tibble [1,668 × 7]> <S3: glm> <tibble [3 × 5]>
#> 3 Summer     3 <tibble [1,539 × 7]> <S3: glm> <tibble [3 × 5]>
#> 4 Winter     4 <tibble [951 × 7]>   <S3: glm> <tibble [4 × 5]>
#> 5 Winter     5 <tibble [1,908 × 7]> <S3: glm> <tibble [4 × 5]>
#> 6 Winter     6 <tibble [2,184 × 7]> <S3: glm> <tibble [4 × 5]>

Just to check that the models got the different formulas for different seasons:

models$fits[[1]]
#> 
#> Call:  glm(formula = STATUS ~ SLOPE + I(SLOPE^2), data = df)
#> 
#> Coefficients:
#> (Intercept)        SLOPE   I(SLOPE^2)  
#>   -0.042618    -0.000989     0.000375  
#> 
#> Degrees of Freedom: 2105 Total (i.e. Null);  2103 Residual
#> Null Deviance:       468 
#> Residual Deviance: 337.2     AIC: 2127

models$fits[[6]]
#> 
#> Call:  glm(formula = STATUS ~ SLOPE + ASPECT + TASP, data = df)
#> 
#> Coefficients:
#> (Intercept)        SLOPE       ASPECT         TASP  
#>    0.024625     0.017838    -0.001768     0.215217  
#> 
#> Degrees of Freedom: 2183 Total (i.e. Null);  2180 Residual
#> Null Deviance:       485.3 
#> Residual Deviance: 385.7     AIC: 2421
camille
  • 16,432
  • 18
  • 38
  • 60
  • This is great. Am I correct that `map2` is only needed when mapping to `data` and `Season`? When creating the `tidied`, `glanced`, and `augmented` list-columns above it seems correct to use `map`, correct? – B. Davis Mar 13 '19 at 22:29
  • Yup. `map2` lets you map across the data and the season. After that, the `broom` functions only need to work on the `fits` list, so `map` is fine – camille Mar 13 '19 at 22:48