
I’m fitting a GAM with the mgcv package to predict burrow abundance and distribution for two species on an island, using data collected during a field trip and Sentinel-2 satellite imagery. 101 plots were surveyed. 922 burrows belonging to species 1 were recorded in 66 plots, and 29 burrows belonging to species 2 were recorded in 8 plots.

I used a negative binomial distribution for species 1 because the model was overdispersed when fitted with a Poisson distribution. The maximal model was:

gam(Species_1 ~ s(x, y, bs="ts") +
                    Sentinel2_band_1 + Sentinel2_band_2 + Sentinel2_band_3 + Sentinel2_band_4 + Sentinel2_band_5 +
                    Sentinel2_band_6 + Sentinel2_band_7 + Sentinel2_band_8 + Sentinel2_band_9 + Sentinel2_band_10 +
                    I(Sentinel2_band_1^2) + I(Sentinel2_band_2^2) + I(Sentinel2_band_3^2) + I(Sentinel2_band_4^2) + I(Sentinel2_band_5^2) +
                    I(Sentinel2_band_6^2) + I(Sentinel2_band_7^2) + I(Sentinel2_band_8^2) + I(Sentinel2_band_9^2) + I(Sentinel2_band_10^2) +
                    aspect + elevation + slope +
                    I(aspect^2) + I(elevation^2) + I(slope^2) +
                    aspect:elevation + aspect:slope + elevation:slope,
                  data = dat,
                  family = nb(1))

The model selection process has resulted in a model that gives acceptable results.
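For readers who want to reproduce the Poisson versus negative binomial comparison, the overdispersion check can be sketched roughly as follows. This is a minimal illustration on simulated counts, not the survey data: the data frame `sim`, its columns, and the simplified formula are hypothetical stand-ins, and the Pearson dispersion statistic is one common check rather than necessarily the one used here.

```r
library(mgcv)

set.seed(1)
n <- 101  # same number of plots as in the survey; the data are simulated

# Hypothetical stand-in data: coordinates, one covariate, overdispersed counts
sim <- data.frame(x = runif(n), y = runif(n), elevation = runif(n, 0, 100))
sim$count <- rnbinom(n, mu = exp(1 + 0.02 * sim$elevation), size = 0.5)

# Poisson fit with a spatial smooth and one parametric term
m_pois <- gam(count ~ s(x, y, bs = "ts") + elevation,
              data = sim, family = poisson, method = "REML")

# Pearson dispersion statistic: values well above 1 suggest overdispersion
disp_pois <- sum(residuals(m_pois, type = "pearson")^2) / df.residual(m_pois)

# Negative binomial fit; nb() with no argument estimates theta from the data
m_nb <- gam(count ~ s(x, y, bs = "ts") + elevation,
            data = sim, family = nb(), method = "REML")
disp_nb <- sum(residuals(m_nb, type = "pearson")^2) / df.residual(m_nb)

print(c(poisson = disp_pois, negbin = disp_nb))
```

One detail worth checking against the mgcv documentation for `nb`: a positive value supplied as `theta`, as in `nb(1)`, is treated as fixed, whereas `nb()` estimates it during fitting.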

When I run the same model using species 2 as the response variable, I get the following warning message:

Warning message:
In newton(lsp = lsp, X = G$X, y = G$y, Eb = G$Eb, UrS = G$UrS, L = G$L,  :
  Fitting terminated with step failure - check results carefully

The diagnostic plots also look pretty dodgy:

[Image: diagnostic plots for the species 2 model]

My assumption is that the issue is due to the much smaller sample size for species 2.

Any ideas what I can do to resolve this problem?
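One way to probe the sample-size explanation is to compare the number of plots with non-zero counts against the number of coefficients the model has to estimate. The sketch below uses simulated data with the sparsity described above (8 occupied plots out of 101). The data frame `sim`, the index vector `occupied`, and the reduced formula are hypothetical, and the small model is only an illustration of the comparison, not a fix confirmed on the real data.

```r
library(mgcv)

set.seed(42)
n <- 101

# Hypothetical stand-in data with non-zero counts in only 8 plots
sim <- data.frame(x = runif(n), y = runif(n),
                  elevation = runif(n, 0, 100))
sim$count <- 0L
occupied <- sample(n, 8)
sim$count[occupied] <- rpois(8, lambda = 3) + 1L  # always at least 1

n_occupied <- sum(sim$count > 0)

# Parametric coefficients in the maximal model from the question:
# 10 bands + 10 squared bands + 3 terrain terms + 3 squared terrain terms
# + 3 interactions + intercept (the spatial smooth adds more on top)
n_param_maximal <- 10 + 10 + 3 + 3 + 3 + 1

# A deliberately small model: low-rank spatial smooth plus one covariate
small <- gam(count ~ s(x, y, bs = "ts", k = 10) + elevation,
             data = sim, family = nb(), method = "REML")
n_coef_small <- length(coef(small))

print(c(occupied_plots = n_occupied,
        maximal_parametric = n_param_maximal,
        small_model = n_coef_small))
```

With far more coefficients than occupied plots, the maximal model has very little information to estimate each term from, which is consistent with the optimiser struggling.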

Bong112
  • This is an older question, but I thought I would post that there is a similar question with an answer here: https://stats.stackexchange.com/questions/576273/gam-model-warning-message-step-failure-in-theta-estimation – Mark Thompson Oct 26 '22 at 00:36
  • Thanks for the link, Mark. I'll look over it later. I got around this issue by reducing the area that the model predicted for. The species with the smaller sample size was only present in the south-west corner of the island. Setting up the model to predict only for the west side of the island allowed us to create an acceptable model. – Bong112 Oct 26 '22 at 07:41
  • Okay...thanks for letting me know. I am running similar models. I used the Duchon spline (bs = "ds") for my x,y per Gavin Simpson's recommendation on this. If I understand correctly, it allows x and y to vary whereas the thin-plate spline (bs = "ts") treats these as isotropic values. I had not thought of using the Sentinel bands directly, but calculated vegetation indices instead. How did this work out for your analysis? – Mark Thompson Oct 26 '22 at 16:48
  • Running the model with the individual bands produced decent results. However, we ended up switching to a vegetation index (NDVI) as it improved model performance. I have seen some published papers that used the individual bands, so it might be a good idea to try both and go with the approach that works best for your dataset. A modified version of the model described in this thread was used for this paper - https://link.springer.com/article/10.1007/s00300-021-02842-3 – Bong112 Oct 28 '22 at 07:18
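The adjustments discussed in these comments (restricting the modelled area, a Duchon spline for the coordinates, and NDVI in place of raw bands) can be sketched together as below. Everything here is an assumption for illustration: the data are simulated, the `x < 0.5` cut-off standing in for "the west side" is hypothetical, and the thread does not say which columns are the red and near-infrared bands, so treating `Sentinel2_band_4` as red and `Sentinel2_band_8` as near-infrared is a guess based on the usual Sentinel-2 band numbering. It is not the model used in the paper.

```r
library(mgcv)

set.seed(7)
n <- 101

# Hypothetical stand-in data; band-to-wavelength mapping is an assumption
sim <- data.frame(
  x = runif(n), y = runif(n),
  Sentinel2_band_4 = runif(n, 0.02, 0.2),  # assumed red reflectance
  Sentinel2_band_8 = runif(n, 0.2, 0.5)    # assumed near-infrared reflectance
)

# NDVI = (NIR - red) / (NIR + red)
sim$NDVI <- with(sim, (Sentinel2_band_8 - Sentinel2_band_4) /
                      (Sentinel2_band_8 + Sentinel2_band_4))

# Simulated counts, with the species absent from the eastern half
sim$count <- rnbinom(n, mu = exp(-0.5 + 2 * sim$NDVI), size = 1)
sim$count[sim$x >= 0.5] <- 0

# Restrict the data to the part of the island where the species occurs
west <- subset(sim, x < 0.5)

# Duchon spline (bs = "ds") on the coordinates plus NDVI as a linear term
fit <- gam(count ~ s(x, y, bs = "ds", m = c(1, 0.5), k = 15) + NDVI,
           data = west, family = nb(), method = "REML")

summary(fit)
```

Whether the reduced area or the switch to NDVI matters more will depend on the dataset, so it is probably worth refitting with each change on its own before combining them.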
