Generating tidy model statistics using purrr::possibly() within dplyr::group_modify

Question

I am trying to fit many exponential models in testDat; however, due to variation in patterns of mass gain among testDat$ID, this model will not succeed for many individuals. The goal is to produce a broom::tidy data frame in which the statistics for every testDat$ID with a successful model are reported, and any individuals without models appear in the data frame with NAs for the model statistics. Below is a sample of data that will fail to model for any of the IDs. I have looked at this post, which taught me that I had to wrap purrr::possibly() around the function throwing the error, rather than around broom::tidy(), so I scrapped ExpMod_1. The error for this was:

Error: The result of .f should be a data frame.

I also looked at this post, which taught me that possibly was indeed returning a function that "accepts the same arguments as its input" (- user3603486), not just the value specified by the otherwise argument...this is confusing to me because it seems like in that answer, NA_character_ is just a static value anyways. The error thrown on this run is:

Error: No tidy method for objects of class function

I have attached some example data and code to demonstrate my problem.

require(tidyverse)
require(broom)

testDat<- tibble(Mass = rnorm(n = 100, mean = 3.5, sd = 0.5),
                 Days_to_Departure = sample(x = (c(1:14)),size = 100, replace = T),
                 ID = sample(x = c(1:4), size = 100, replace = T))
ExpMod_1<-testDat %>% 
  group_by(ID) %>%
  group_modify(.f = ~possibly(~tidy(nls(Mass_Visit ~ a*exp(-b*Days_to_Departure) + c, data = .x,
                                        start = list(a=1.2, b=0.5, c = 3.5),
                                        control = list(maxiter = 500),
                                        trace = T)),otherwise = ~tibble(estimate    = c(NA_real_),
                                                                       p.value      = c(NA_real_),
                                                                       statistic    = c(NA_real_),
                                                                       std.error    = c(NA_real_),
                                                                       term = c(NA_character_))))
ExpMod_2<-testDat %>% 
  group_by(ID) %>%
  group_modify(.f = ~tidy(possibly(~nls(Mass_Visit ~ a*exp(-b*Days_to_Departure) + c, data = .x,
                                        start = list(a=1.2, b=0.5, c = 3.5),
                                        control = list(maxiter = 500),
                                        trace = T),otherwise = ~list(m = NA_character_))))

Both of these errors make sense to me. I want to know whether it is possible to nest these functions as they are packaged, or whether what I am trying to do would require writing a different function. The trouble is that I am trying to fit many models and I expect many failures, which I will address with different models (nls, lm, nlme, lmer, etc.). But I want to fit and compare all of these, so I need to know what succeeds, where, and when.

Thanks in advance, any suggestions or feedback would be very much appreciated.

score 1 · Accepted Answer · answered Dec 17 '20 at 06:43

I find the syntax of possibly() very confusing. Here is a way to do this with tryCatch():

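library(tidyverse)
library(broom)  # tidy() comes from broom, which library(tidyverse) does not attach

testDat %>% 
  group_by(ID) %>%
  summarise(data = list(tryCatch({
    # cur_data() is the current group's rows (pick(everything()) in dplyr >= 1.1.0);
    # .x is only defined inside lambdas such as group_modify(~ ...), not in summarise()
    # Note: testDat above names the response Mass, so with that sample data
    # Mass_Visit is not found and every group falls through to the NA row
    tidy(nls(Mass_Visit ~ a*exp(-b*Days_to_Departure) + c, data = cur_data(),
            start = list(a=1.2, b=0.5, c = 3.5),
            control = list(maxiter = 500),
            trace = TRUE))
    }, error = function(e) {
      # Fallback: a single NA row with the same columns tidy() returns for nls
      tibble(estimate    = c(NA_real_),
             p.value      = c(NA_real_),
             statistic    = c(NA_real_),
             std.error    = c(NA_real_),
             term = c(NA_character_))
      })))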

#     ID data            
#  <int> <list>          
#1     1 <tibble [1 × 5]>
#2     2 <tibble [1 × 5]>
#3     3 <tibble [1 × 5]>
#4     4 <tibble [1 × 5]>
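
On the original question of whether possibly() can be combined with tidy() inside group_modify(): a minimal sketch of one way that seems to work is below. The key points are that possibly() returns a function that still has to be called on each group, and that otherwise takes a plain value rather than a formula. The names fit_exp, safe_fit_exp and na_row are hypothetical helpers introduced here, the seed is arbitrary, and the response is written as Mass to match the sample data (the question's calls use Mass_Visit).

```r
library(dplyr)
library(tibble)
library(purrr)
library(broom)

set.seed(1)  # assumption: arbitrary seed, only so the sketch is reproducible
testDat <- tibble(Mass = rnorm(n = 100, mean = 3.5, sd = 0.5),
                  Days_to_Departure = sample(x = 1:14, size = 100, replace = TRUE),
                  ID = sample(x = 1:4, size = 100, replace = TRUE))

# Placeholder row used when a fit fails; same columns tidy() returns for nls
na_row <- tibble(term = NA_character_, estimate = NA_real_,
                 std.error = NA_real_, statistic = NA_real_,
                 p.value = NA_real_)

# Hypothetical helper: fit and tidy in one function so both steps are guarded
fit_exp <- function(d) {
  tidy(nls(Mass ~ a * exp(-b * Days_to_Departure) + c, data = d,
           start = list(a = 1.2, b = 0.5, c = 3.5),
           control = list(maxiter = 500)))
}

# possibly() returns a new function; it does not run the fit by itself
safe_fit_exp <- possibly(fit_exp, otherwise = na_row)

ExpMod <- testDat %>%
  group_by(ID) %>%
  group_modify(~ safe_fit_exp(.x)) %>%  # call the wrapped function per group
  ungroup()
```

Whether a given ID ends up with three coefficient rows or the single NA row depends on whether nls() converges for that group, so no output is shown here.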

Fantastic! With some minor tweaks this did exactly what I was looking for. For reference, the adjustment was to use `group_modify(~tryCatch({tidy(nls(Mass_Visit ~ a*exp(-b*Days_to_Departure) + c, data = .x,` after the call that groups the data. I preferred group_modify() because of its tibble-in/tibble-out behaviour (no need to nest and unnest). Also, summarise() looked like it was getting confused about which data to work on with '.'. Very excited to have this working, thank you! - SGE, Dec 17 '20 at 17:09
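
The comment's snippet is cut off, so the following is only a sketch of what a complete group_modify() plus tryCatch() call might look like, assuming the remaining nls() arguments and the NA fallback carry over from the answer. toyDat is hypothetical data made up for illustration: ID 1 follows the exponential curve with a little noise, while ID 2 has a constant Days_to_Departure, which makes the gradient singular so nls() raises an error.

```r
library(dplyr)
library(tibble)
library(broom)

set.seed(42)  # assumption: arbitrary seed for reproducibility
# Hypothetical toy data, not from the original post
toyDat <- tibble(
  ID = rep(1:2, each = 28),
  Days_to_Departure = c(rep(1:14, times = 2), rep(5, 28))
) %>%
  mutate(Mass_Visit = 1.2 * exp(-0.5 * Days_to_Departure) + 3.5 +
           rnorm(n(), sd = 0.05))

ExpMod <- toyDat %>%
  group_by(ID) %>%
  group_modify(~ tryCatch(
    # .x is the current group's rows (without the grouping column)
    tidy(nls(Mass_Visit ~ a * exp(-b * Days_to_Departure) + c, data = .x,
             start = list(a = 1.2, b = 0.5, c = 3.5),
             control = list(maxiter = 500))),
    # Fallback row with the same columns tidy() returns for an nls fit
    error = function(e) tibble(term = NA_character_, estimate = NA_real_,
                               std.error = NA_real_, statistic = NA_real_,
                               p.value = NA_real_)
  )) %>%
  ungroup()
```

Because group_modify() expects a data frame back from every group and binds the pieces together, the result is already one flat tibble with ID alongside the tidy() columns, and failed groups show up as a single NA row.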
