6

In the R package spatstat (I am using the current version, 1.31-0) , there is an option use.gam. When you set this to true, you can include smooth terms in the linear predictor, the same way you do with the R package mgcv. For example,

g <- ppm(nztrees, ~1+s(x,y), use.gam=TRUE) 

Now, if I want a confidence interval for the intercept, you can usually use summary or vcov, which works when you don't use gam but fails when you do use gam

vcov(g)

which gives the error message

Error in model.frame.default(formula = fmla, data = 
    list(.mpl.W = c(7.09716796875,  :invalid type (list) for variable 's(x, y)'

I am aware that this standard error approximation here is not justified when you use gam, but this is captured by the warning message:

In addition: Warning message: model was fitted by gam();
            asymptotic variance calculation ignores this 

I'm not concerned about this - I am prepared to justify the use of these standard errors for the purpose I'm using them - I just want the numbers and would like to avoid "writing-my-own" to do so.

The error message I got above does not seem to depend on the data set I'm using. I used the nztrees example here because I know it comes pre-loaded with spatstat. It seems like it's complaining about the variable itself, but the model clearly understands the syntax since it fits the model (and the predicted values, for my own dataset, look quite good, so I know it's not just pumping out garbage).

Does anybody have any tips or insights about this? Is this a bug? To my surprise, I've been unable to find any discussion of this online. Any help or hints are appreciated.

Edit: Although I have definitively answered my own question here, I will not accept my answer for the time being. That way, if someone is interested and willing to put in the effort to find a "workaround" for this without waiting for the next edition of spatstat, I can award the bounty to him/her. Otherwise, I'll just accept my own answer at the end of the bounty period.

Macro
  • 1,450
  • 1
  • 10
  • 18
  • I get the same error message if I just try to print the g to the screen. It seems that the problem occurs when calling the model.frame function. With simple debugging the error seems to occur in line data <- .Internal(model.frame(formula, rownames, variables, varnames, extras, extranames, subset, na.action)) – Jouni Helske Feb 27 '13 at 20:11
  • Hi @Hemmo, I can see that. But, the model still is estimated (e.g. `coef(g)` works) and you can plot predicted values, etc. (although, when you try to get standard errors for the predictions, you're back to this error). Any tips? – Macro Feb 28 '13 at 03:42
  • Quick looking at the code of ppm and mpl.engine, I would say that ppm and it's subfunctions do not use model.frame approach. It saves formula and data into the output (g$internal), but default formula parsing /model.frame.default cannot handle list s(x,y) as there is no such thing in the data frame. My guess is that this is a bug, and you should ask this from the package author. You could also test this with older version of the package and see if you get the same error. – Jouni Helske Feb 28 '13 at 06:34

2 Answers2

4

I have contacted one of the package authors, Adrian Baddeley, about this. He promptly responded and let me know that this is indeed a bug with the software and, apparently, I am the first person to encounter it. Fortunately, it only took him a short time to track down the issue and correct it. The fix will be included in the next release of spatstat, 1.31-1.

Edit: The updated version of spatstat has been released and does not have this bug anymore:

g <- ppm(nztrees, ~1+s(x,y), use.gam=TRUE)
sqrt( vcov(g)[1,1] ) 
[1] 0.1150982
Warning message:
model was fitted by gam(); asymptotic variance calculation ignores this 

See the spatstat website for other release notes. Thanks to all who read and participated in this thread!

Macro
  • 1,450
  • 1
  • 10
  • 18
1

I'm not sure you can specify the trend the way you have which is possibly what is causing the error. It doesn't seem to make sense according to the documentation:

The default formula, ~1, indicates the model is stationary and no trend is to be fitted.

But instead you can specify the model like so:

g <- ppm(nztrees, ~x+y, use.gam=TRUE)   
#Then to extract the coefficientss:
>coef(g)
(Intercept)             x             y 
-5.0346019490  0.0013582470 -0.0006416421 
#And calculate their se:
vc <- vcov(g)
se <- sqrt(diag(vc))
> se
(Intercept)           x           y 
0.264854030 0.002244702 0.003609366 

Does this make sense/expected result? I know that the package authors are very active on the r-sig-geo mailing lsit as they have helped me in the past. You may also want to post your question to that mailing list, but you should reference your question here when you do.

Simon O'Hanlon
  • 58,647
  • 14
  • 142
  • 184
  • Hi @Simon, thanks for your attention. The `s(x,y)` is `gam` syntax for specifying the effect of `x,y` by a non-parametric smooth function (that is estimated). See the [gam documentation](http://127.0.0.1:31246/library/mgcv/html/gam.html). Note that the model *is* estimated as a functional parameter when you run my code (e.g. plot the predicted surface - `plot(predict(g))`), but it seems that the link with `gam` is incomplete, since SEs for the non-smooth terms aren't available. The model you've fit is a usual log-linear in `x` and `y` and I have had success getting those standard errors, etc. – Macro Feb 28 '13 at 18:22
  • @Macro ah I should have paid more attention. I will look into it further to see if I can help, but I strongly advise to post to r-sig-geo, and I strongly advise to read the [posting guide](http://www.r-project.org/posting-guide.html) first. I have been slated for a poorly formed question the mailing list before! – Simon O'Hanlon Feb 28 '13 at 18:25
  • Non-local [gam documentation](http://finzi.psych.upenn.edu/R/library/mgcv/html/gam.html) for anyone else following this – Simon O'Hanlon Feb 28 '13 at 18:27