I am running a GAM across many samples and am extracting coefficients/t-values/r-squared from the results in the way shown below. For background, I am using a natural spline, so the regular lm() works fine here and perhaps that is why this method works fine.

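library(plyr)  # provides ldply(); one output row per fitted model
tvalsm93exf=ldply(fitsm93exf, function(x) as.data.frame(t(coef(summary(x))[,'t value', drop=FALSE])))
# summary(x) is a list, not a matrix, so take r.squared directly
r2m93exf=ldply(fitsm93exf, function(x) data.frame(r.squared=summary(x)$r.squared))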

I would also like to extract the knot locations for each sample set (df=4 and no intercept, so three internal knots plus the two boundary knots). I have tried several variations of the commands above, but haven't been able to index into the result. The usual way to do this for a single fit is shown below, so I was attempting to put it into the form above. But I am not certain whether the summary object contains these values, or whether there is another part of the fit I should be using instead.

attr(terms(fits),"predvars") 

http://www.inside-r.org/r-doc/splines/ns

Note: This question is related to the question below, if that helps, though its solution did not help me solve my problem: Extract estimates of GAM

Z_D

1 Answer

The knots are fixed at the time the ns function is called in the examples on the help page you linked to, so you could have extracted the knots without going into the model object. But ... you have not provided the code for the GAM model creation, so we can only speculate about what you might have done. Just because the word "spline" is used in both the ?ns help page and in the documentation does not mean they are the same thing. The model in the other page you linked to had two "smooth" terms constructed with the s function.

  .... + s(time,bs="cr",k=200) + s(tmpd,bs="cr")

The result of that gam call had a list node named "smooth" and the first one looked like this when viewed with str():

str(ap1$smooth)
List of 2
 $ :List of 22
  ..$ term          : chr "time"
  ..$ bs.dim        : num 200
  ..$ fixed         : logi FALSE
  ..$ dim           : int 1
  ..$ p.order       : logi NA
  ..$ by            : chr "NA"
  ..$ label         : chr "s(time)"
  ..$ xt            : NULL
  ..$ id            : NULL
  ..$ sp            : Named num -1
  .. ..- attr(*, "names")= chr "s(time)"
  ..$ S             :List of 1
  .. ..$ : num [1:199, 1:199] 5.6 -5.475 2.609 -0.577 0.275 ...
  ..$ rank          : num 198
  ..$ null.space.dim: num 1
  ..$ df            : num 199
  ..$ xp            : Named num [1:200] -2556 -2527 -2502 -2476 -2451 ...
  .. ..- attr(*, "names")= chr [1:200] "0.0000000%" "0.5025126%" "1.0050251%" "1.5075377%" ...
  ..$ F             : num [1:40000] 0 0 0 0 0 0 0 0 0 0 ...
  ..$ plot.me       : logi TRUE
  ..$ side.constrain: logi TRUE
  ..$ S.scale       : num 9.56e-05
  ..$ vn            : chr "time"
  ..$ first.para    : num 5
  ..$ last.para     : num 203
  ..- attr(*, "class")= chr [1:2] "cr.smooth" "mgcv.smooth"
  ..- attr(*, "qrc")=List of 4
  .. ..$ qr   : num [1:200, 1] -0.0709 0.0817 0.0709 0.0688 0.0724 ...
  .. ..$ rank : int 1
  .. ..$ qraux: num 1.03
  .. ..$ pivot: int 1
  .. ..- attr(*, "class")= chr "qr"
  ..- attr(*, "nCons")= int 1
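For a "cr" smooth like the one above, the xp element is where the knot locations are kept, so they can be pulled straight out of the fitted object. A minimal sketch on made-up data (the data frame d and the model m are hypothetical stand-ins, not the ap1 model from the other question):

```r
library(mgcv)

set.seed(1)
# Made-up data, assumed only for illustration
d <- data.frame(x = runif(300))
d$y <- sin(2 * pi * d$x) + rnorm(300, sd = 0.3)

# One cubic regression spline term with 10 knots
m <- gam(y ~ s(x, bs = "cr", k = 10), data = d)

# Knot locations of the first (and only) smooth term
cr_knots <- m$smooth[[1]]$xp
print(cr_knots)
```

With several smooth terms you would index m$smooth[[i]] for each term in turn.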

So the smooth was built on a grid of 200 knots (the xp element), with a piecewise cubic function fit to the data on that grid. If you forced the knots to be at three interior locations, then they will just be at the extremes and at evenly spaced quantiles of the data between the extremes.
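For ns() specifically, when only df is given, the interior knots are placed at quantiles of x and the boundary knots at its range, and both are stored as attributes of the basis matrix. A small sketch to check this on made-up data (x here is simulated, not the asker's data):

```r
library(splines)

set.seed(42)
# Skewed made-up data, so quantiles differ clearly from an even grid
x <- rexp(200)

basis <- ns(x, df = 4)  # no intercept, so 3 interior knots

interior <- as.numeric(attr(basis, "knots"))
boundary <- as.numeric(attr(basis, "Boundary.knots"))

# Compare against the quartiles and the range of x
quartiles <- as.numeric(quantile(x, c(0.25, 0.5, 0.75)))
print(all.equal(interior, quartiles))
print(all.equal(boundary, range(x)))
```

If both comparisons come back TRUE, the knots can be computed from the data alone, without touching the model object.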

IRTFM
  • Thanks for your help. I understand that the other post had a different form of spline, I was simply trying to show that I was aware of similar discussions that had taken place here. You are correct, though, that setting dfs here forces the model to form equal quartiles of the data. So I can solve my problem by calling something that extracts quartiles of each data sample. – Z_D Oct 23 '14 at 12:46
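For anyone landing here later, a rough sketch of pulling the knots out of each fit in a list, assuming every model was fit with lm() and has a single ns() term as its only predictor. The names samples, fits and get_knots are made up for illustration; the "predvars" attribute is the one mentioned in the question.

```r
library(splines)

set.seed(1)
# Hypothetical stand-in for the per-sample data sets
samples <- lapply(1:3, function(i) {
  d <- data.frame(x = runif(100))
  d$y <- sin(2 * pi * d$x) + rnorm(100, sd = 0.2)
  d
})
fits <- lapply(samples, function(d) lm(y ~ ns(x, df = 4), data = d))

get_knots <- function(fit) {
  # predvars is a call of the form list(y, ns(x, knots = ..., Boundary.knots = ...))
  pv <- attr(terms(fit), "predvars")
  ns_call <- pv[[3]]  # assumes the ns() term is the first predictor
  bk <- as.numeric(eval(ns_call$Boundary.knots))
  ik <- as.numeric(eval(ns_call$knots))
  c(lower = bk[1], k1 = ik[1], k2 = ik[2], k3 = ik[3], upper = bk[2])
}

# One row per fit: lower boundary, three interior knots, upper boundary
knots <- t(sapply(fits, get_knots))
print(knots)
```

The interior columns should match quantile(x, c(.25, .5, .75)) of each sample, which is the shortcut described in the comment above.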