2

I'm trying to use the ns() function from the splines package in a Poisson GLM that tests whether particulate matter concentration (pm.lag0) has a significant effect on health outcomes (Freq):

   > gfit4 = glm(Freq ~ pm.lag0 + ns(date, df=2), family = poisson(), 
                 data = dt,  offset = log(pop))

I get these errors back:

Error in splineDesign(knots, x, ord, derivs, outer.ok = outer.ok) : 
  must have at least 'ord' knots
In addition: Warning message:
In sort(as.numeric(knots)) : NAs introduced by coercion

Is that not a valid use of ns()? Can someone help me decode this error message? The documentation R provides for splines (?ns) doesn't seem to cover this error.

Mogsdad
mEvans
  • Welcome to StackOverflow! Before you ask many more questions, may I suggest that you spend a few minutes reading a particular section of the [FAQ](http://stackoverflow.com/faq#howtoask). – joran Mar 20 '12 at 02:02
  • @joran- sure i will- i definitely don't want to violate the StackOverflow policies. Is there something specific you were pointing out that I'm doing wrong? – mEvans Mar 20 '12 at 02:08
  • probably http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example – Ben Bolker Mar 20 '12 at 02:21
  • What Ben said, plus the fact that it's really helpful (but by no means required) if folks accept the answer that solved their problem. – joran Mar 20 '12 at 02:42
  • Oh, and you weren't violating anything, it was just a friendly suggestion. – joran Mar 20 '12 at 02:44
  • @mEvans The point about a reproducible example is that it allows someone else (like me) to, um, reproduce your problem. Then I can read the source code, read the help page, etc. If you had posted a reproducible example, I might have investigated. But since you haven't, I am not inclined to spend the time to build my own example and trace the problem. So I'm ignoring the question and moving on. The point is that you may be lucky and somebody happens to know the answer. Or you can provide more information and possibly get someone interested enough to help you. – Andrie Mar 20 '12 at 06:14
  • @mEvans Please don't presume to know when it is or is not helpful to have a reproducible example. I have hazarded a guess based on my existing knowledge of the functions involved. Others would have been willing to trace the execution of the example to identify the error even if they didn't know what the error might be. Without the reproducible example, as Andrie illustrates, you will lose the input from that large section of the community, thus restricting yourself to a much smaller subset who have (or, like me, think they might have) the required domain-specific knowledge. – Gavin Simpson Mar 20 '12 at 08:44

2 Answers

6

I can't see any reason why, in principle, you couldn't use ns() in glm(). To see why, study what ns() does in a formula. From ?ns:

> model.frame(weight ~ ns(height, df = 5), data = women)
   weight ns(height, df = 5).1 ns(height, df = 5).2 ns(height, df = 5).3 ns(height, df = 5).4 ns(height, df = 5).5
1     115         0.000000e+00         0.000000e+00         0.000000e+00         0.000000e+00         0.000000e+00
2     117         7.592323e-03         0.000000e+00        -8.670223e-02         2.601067e-01        -1.734045e-01
3     120         6.073858e-02         0.000000e+00        -1.503044e-01         4.509132e-01        -3.006088e-01
4     123         2.047498e-01         6.073858e-05        -1.677834e-01         5.033503e-01        -3.355669e-01
5     126         4.334305e-01         1.311953e-02        -1.324404e-01         3.973211e-01        -2.648807e-01
6     129         6.256681e-01         8.084305e-02        -7.399720e-02         2.219916e-01        -1.479944e-01
7     132         6.477162e-01         2.468416e-01        -2.616007e-02         7.993794e-02        -5.329196e-02
8     135         4.791667e-01         4.791667e-01         1.406302e-02         2.031093e-02        -1.354062e-02
9     139         2.468416e-01         6.477162e-01         9.733619e-02         2.286023e-02        -1.524015e-02
10    142         8.084305e-02         6.256681e-01         2.707683e-01         6.324188e-02        -4.052131e-02
11    146         1.311953e-02         4.334305e-01         4.805984e-01         1.252603e-01        -5.240872e-02
12    150         6.073858e-05         2.047498e-01         5.954160e-01         1.989926e-01         7.809246e-04
13    154         0.000000e+00         6.073858e-02         5.009718e-01         2.755102e-01         1.627794e-01
14    159         0.000000e+00         7.592323e-03         2.246113e-01         3.520408e-01         4.157556e-01
15    164         0.000000e+00         0.000000e+00        -1.428571e-01         4.285714e-01         7.142857e-01

This shows that it provides the B-spline basis for the natural spline of the height variable. Nothing special here.

Hence I suspect your date variable is not numeric, or is not something R can coerce to numeric without introducing NAs (see the warning message). Without a reproducible example and information on your data, however, it is impossible to tell!
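A minimal sketch of that diagnosis, assuming the date column was read in as text. The data frame below is simulated purely for illustration (only the column names come from the question), and date_num is a hypothetical helper column. The exact error message from ns() on a character vector may differ between R versions, so the sketch only records that an error occurs.

```r
library(splines)

# Simulated stand-in for `dt`; values are invented, column names follow the question.
set.seed(1)
n <- 200
dt <- data.frame(
  date    = as.character(seq(as.Date("2011-01-01"), by = "day", length.out = n)),
  pm.lag0 = runif(n, 5, 40),
  pop     = 1e5,
  stringsAsFactors = FALSE
)
dt$Freq <- rpois(n, lambda = 20)

# First check what type the date column actually is.
str(dt$date)

# ns() cannot build a basis from character data; just record that it fails.
chr_result <- tryCatch(ns(dt$date, df = 2), error = function(e) "error")

# Convert to Date, then to days since 1970-01-01, before calling ns().
dt$date_num <- as.numeric(as.Date(dt$date))

gfit4 <- glm(Freq ~ pm.lag0 + ns(date_num, df = 2) + offset(log(pop)),
             family = poisson(), data = dt)
summary(gfit4)
```

If str() reports chr or Factor for your own date column, converting it as above (adjusting the format argument of as.Date() to match your data) is the first thing to try.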

In addition, you might want to look at the gam() function in package mgcv, which, as a Recommended package, is distributed with R. It is designed to fit semi-parametric models in the manner you describe and can include parametric terms as well as smooths/splines of other terms. The package is fairly comprehensive and can fit a large number of types of spline. See its manual.
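As a rough sketch of what the gam() version might look like: the data are simulated for illustration only, date_num is a hypothetical numeric copy of the date, and the default basis for s() and the choice of method = "REML" are assumptions rather than recommendations for your data.

```r
library(mgcv)

# Simulated stand-in for `dt`; values are invented, column names follow the question.
set.seed(1)
n <- 200
dt <- data.frame(
  date    = seq(as.Date("2011-01-01"), by = "day", length.out = n),
  pm.lag0 = runif(n, 5, 40),
  pop     = 1e5
)
dt$Freq     <- rpois(n, lambda = 20)
dt$date_num <- as.numeric(dt$date)  # smooths need a numeric covariate

# Parametric term for pm.lag0, penalised smooth of time, log(pop) as offset.
gfit_gam <- gam(Freq ~ pm.lag0 + s(date_num) + offset(log(pop)),
                family = poisson(), data = dt, method = "REML")
summary(gfit_gam)
```

Unlike ns(), where you fix the degrees of freedom up front, s() estimates the amount of smoothing from the data, so the summary reports an effective degrees of freedom for the time trend.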

Gavin Simpson
1

Read the help page for the ns function (?ns). The section on the df argument includes:

One can supply ‘df’ rather than knots;
          ‘ns()’ then chooses ‘df - 1 - intercept’ knots at suitably
          chosen quantiles of ‘x’

And since you specified 2 degrees of freedom and did not suppress the default intercept, that means you asked it to fit a spline with 0 knots, which it does not know how to do. Try specifying a larger number for df and it should work for you.
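It is worth checking the knot count on your own installation, since it depends on the default for intercept. The sketch below uses the built-in women data and inspects the "knots" attribute that ns() attaches to its result. In the versions of splines I am aware of, intercept defaults to FALSE, so by the quoted formula df = 2 would give one interior knot rather than zero; treat that as something to verify rather than a given.

```r
library(splines)

x <- women$height

# Default value of the intercept argument in this installation.
default_intercept <- formals(ns)$intercept

# Interior knots chosen by ns() for a few df values.
n_knots <- sapply(2:4, function(d) length(attr(ns(x, df = d), "knots")))
names(n_knots) <- paste0("df=", 2:4)
n_knots

# Number of basis columns produced for df = 2.
basis_cols <- ncol(ns(x, df = 2))
```

If ns(x, df = 2) runs cleanly here on a numeric vector, the error in the question is more likely to come from the data than from the choice of df.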

Greg Snow