1

I have time series data where I calculated disease severity in plants at different intervals. My response variable is between 0 & 1 (both exclusive), so beta regression seem to be the most appropriate model. My predictors are weather variables - the aim is to determine the effect of weather variables on disease severity. Here is a reproducible example.

df <- structure(
  list(
    year = structure(
      c(1L, 2L, 3L, 5L, 4L, 6L, 7L,
        8L, 9L, 10L),
      .Label = c(
        "2007",
        "2012",
        "2013",
        "2014",
        "2014.1",
        "2015.1",
        "2015.2",
        "2016",
        "2017",
        "2020"
      ),
      class = "factor"
    ),
    mean_rh = c(
      83.9025107032967,
      86.3309364921875,
      78.7225209283154,
      82.3598611111111,
      77.8125490392157,
      77.5694460507813,
      77.340364207483,
      78.601888359589,
      77.9234626042403,
      79.1268228283582
    ),
    mean_temp = c(
      8.06137667087912,
      3.75335204210526,
      8.39571386344086,
      7.57235900444444,
      10.5797098501961,
      8.52468121914062,
      9.63416591190476,
      12.2749429150685,
      9.65211886219081,
      10.1620900981343
    ),
    mean_ws = c(
      1.84656288406593,
      2.0747284924812,
      1.92455694623656,
      2.12702791944444,
      1.91802733215686,
      1.77915314179687,
      1.71631475340136,
      1.78024162876712,
      1.45554503710247,
      1.60409589440299
    ),
    total_rain = c(
      469.5,
      367.1,
      509.3,
      358.7,
      562.3,
      547.6,
      756.4,
      789.5,
      640.5,
      665.1
    ),
    severity = c(
      0.81667,
      0.01325,
      0.06125,
      0.81667,
      0.0198,
      0.0623,
      0.0035,
      0.00475,
      0.885,
      0.0348
    )
  ),
  class = c("grouped_df", "tbl_df", "tbl", "data.frame"),
  row.names = c(NA,-10L),
  groups = structure(
    list(
      year = structure(
        1:10,
        .Label = c(
          "2007",
          "2012",
          "2013",
          "2014",
          "2014.1",
          "2015.1",
          "2015.2",
          "2016",
          "2017",
          "2020"
        ),
        class = "factor"
      ),
      .rows = structure(
        list(1L,
             2L, 3L, 5L, 4L, 6L, 7L, 8L, 9L, 10L),
        ptype = integer(0),
        class = c("vctrs_list_of",
                  "vctrs_vctr", "list")
      )
    ),
    row.names = c(NA,-10L),
    class = c("tbl_df",
              "tbl", "data.frame"),
    .drop = TRUE
  )
)

My model is below

mod05 <-
  betareg(severity_ps ~ mean_temp +  mean_ws + total_rain + mean_rh,
          data = dat_preplanting)
summary(mod05)

However, when I checked the model residual autocorrelation using the acf() function, the plot is showing very high temporal autocorrelation.

enter image description here

However, when I check partial autocorrelation using pacf(mod05$residuals) function, no autocorrelation is detected.

My questions are,

  1. Is there autocorrelation problem if acf plot is showing autocorrelation but pacf plot is not?

  2. How to do I account for temporal autocorrelation in betareg R package? I checked the documentation but didn't find anything.

I tried to fit the model using glmmTMB & gam, including year as a random effect but the model failed to converge, suggesting that I have very limited degree of freedom. So I really need to find a way to account for temporal autocorrelation in betareg package. Thanks for any assistance.

Ahsk
  • 241
  • 1
  • 7
  • 3
    It would be really helpful to have a reproducible example. I think it's not possible to add temporal autocorrelation to a `betareg` model, but we might be able to help you with the `glmmTMB`/`gam` versions ... what was the code you used to generate your ACF plot? – Ben Bolker Nov 12 '22 at 01:38
  • Hi Ben, Thanks for the comment. I used this code acf(mod05$residuals). I wanted to use glmmTMB first because I am more familiar with glmmTMB. But the problem is the model fails to converge when I include year as a random effect. I can run a model without random effect, but that would be wrong because glmmTMB are essentially are mixed models? Can I fit a model without a random effect with glmmtmb, and include AR1 term? My data was collected in different years. Plants were harvested each year after the experiment, and new plants were planted each year. – Ahsk Nov 12 '22 at 15:10
  • I am not sure if that is time series because new plants were used each year. Although there were 2-3 disease assessments each year, we have only considered the final disease assessment in our study. The only reason I think it is time series because diseased plants were left out in the field to infect newly infected plants in the next season. – Ahsk Nov 12 '22 at 15:11
  • My glmmTMB model code is below (but I can't include random effect due to tiny data). And I don't know how to include AR1 structure for year mod05 <- glmmTMB(severity_ps ~ mean_temp + mean_ws + total_rain + mean_rh , family = beta_family(), data = dat_preplanting) – Ahsk Nov 12 '22 at 15:11
  • Just curious, if it's not possible to add temporal autocorrelation to `betareg` model, does that mean they are immune to temporal autocorrelation? I would also like to add that the total duration of my experiment in year varied – Ahsk Nov 12 '22 at 15:13
  • 1
    I still don't really understand what's going on here. ACF is usually computed with respect to some index variable (e.g. time), so `acf(mod05$residuals)` doesn't really make sense unless your data set is one long time series. On the other hand, I don't quite understand why you would get high autocorrelation in this case ... again, any chance of a **reproducible example** ? – Ben Bolker Nov 13 '22 at 15:49
  • @Ben Bolker Sorry, I didn't know how to create a reproducible example. I have watched a tutorial on YouTube on how to create a reproducible example using dput function. I have edited my question to include the dataset above. It's a very tiny dataset. I hope it will work. Thanks again for your time. – Ahsk Nov 13 '22 at 17:07
  • When I run the example with your data above I get a completely different ACF plot - autocorr for lag > 0 is very small, as expected ... People on SO don't prefer it, but for this kind of problem it might be easiest if you post your data somewhere publicly accessible and provide a link to it ... – Ben Bolker Nov 13 '22 at 18:26
  • @BenBolker How did you calculate ACF? Can I get the code, please? All I need to is to show there is no significant autocorrelation in my report. That's my complete data, as I have 11 data points only. Thanks – Ahsk Nov 13 '22 at 20:36

0 Answers0