3

I'm trying to predict a negative binomial model to a stack of rasters using the predict function in the raster package. I need to include an offset term to normalize my count variable. I have tried unsuccessfully to get this to work using a method where the offset term is included in the model like this:

condor.glm <- glm.nb(y_count ~ logsafefood + logpigharvest +
                       logintdist + houseden + pubforest + pubrange + 
                       privforest + privrange + offset(log(offset)), 
                     data=merge.Hex, control=glm.control(maxit=1000))

predict(rasStack2, condor.glm, filename="cencal_predictlow_model15.img", 
        overwrite=TRUE, type="response", progress="text",na.rm=TRUE)

R throws this error:

Error in log(offset) : non-numeric argument to mathematical function

If I pass offset as an argument instead:

condor.glm <- glm.nb(y_count ~ logsafefood + logpigharvest + 
                       logintdist + houseden + pubforest + pubrange + 
                       privforest + privrange, 
                     offset=log(offset), 
                     data=merge.Hex, control=glm.control(maxit=1000))

predict(rasStack2, condor.glm, filename="cencal_predictlow_model15.img",  
        overwrite=TRUE, type="response", progress="text",na.rm=TRUE)

R creates the prediction surfaces but the values are all equal.

Any suggestions for how to do this correctly?

jbaums
  • 27,115
  • 5
  • 79
  • 119
  • 3
    Welcome to SO. Do you have a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example) for your question? – Richard Erickson Feb 26 '16 at 20:05
  • Please include some sample data with your question to make it easier to help. Also, the more minimal the example the better. Use only columns necessary to reproduce the problem. – MrFlick Feb 26 '16 at 21:26
  • What does this return: `str(offset)`? – IRTFM Feb 26 '16 at 23:03
  • > str(merge.Hex$offset) num [1:287] 1807 1807 1807 1807 1807 ... – Holly Copeland Feb 28 '16 at 14:13
  • Here is a copy of my sample data for the input model: – Holly Copeland Feb 28 '16 at 14:15
  • I am fairly certain that the source of the error is from stats::predict.glm and not raster::predict. Can you please test this by back predicting to the x matrix used in the model? In using raster predict you are calling two predict functions. The raster predict is a wrapper for whatever model you pass it. Because of this you must make sure that your prediction call for the model is working before calling raster::predict. If this is the case please edit your answer to remove the raster part as it is convolving the problem. BTW, are your offset values unique length(unique(merge.Hex$offset)) ? – Jeffrey Evans Feb 29 '16 at 16:28
  • Thanks, Jeff. I will try back predicting. To answer your question, my offset value is not unique to each sample unit, as I am trying to adjust the y_count in each sample unit of my dataset (row) to the total number of GPS locations (in the example above, I'm using offset = log(1807)). Perhaps I'm not understanding the offset term correctly? – Holly Copeland Mar 01 '16 at 03:01
  • So, I tried back predicting the model to the spatial polygon data frame (x matrix) that I used to create the model and predict worked without errors and produced the result that I was expecting. – Holly Copeland Mar 01 '16 at 13:30
  • Holly, if you want to add data (it would help), [edit your post](http://stackoverflow.com/posts/35660447/edit) and add it there, rather than in comments. – jbaums Mar 02 '16 at 13:08

1 Answers1

1

The problem is not related to the raster::predict function but rather your data. If you back predict the data you will see that it results in uniform values.

The issue is that you have a column "offset" named the same as a function "stats::offset. If you simply rename the "offset" column the model call for both your syntax + log(offsets) or offset = log(offsets) works.

To get the raster::predict function to work you will either have to fiddle with the “const” argument or create a constant raster, that can be added to your raster stack, with a uniform value of log(1807), which would be the easiest approach.

Jeffrey Evans
  • 2,325
  • 12
  • 18