1

I'm trying to find a structural break in my time series using the breakpoints() function (in the strucchange package). My goal is to find where is "knot" in my dataset. I'm looking for a procedure which would test all possible knots and choose the one who minimize an information criterion such AIC or BIC. breakpoints() does a good job but I would like to draw a continuous piecewise linear function. This, I would like the intercept to be the same before and after the breakpoint. Is anyone aware of a function or an option to do this ?

On the picture below, the red line is the true model and the blue line is fitted using breakpoints(). I would like a procedure which would fit the true model (no jump at the breakpoint).

See my gist file to reproduce this example.

enter image description here

PAC
  • 5,178
  • 8
  • 38
  • 62
  • 3
    Please provide a [reproducible example](http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example). – Thomas Aug 01 '13 at 14:23
  • In addition to not including `require(strucchange)` which I assume is where you are getting `breakpoints` in your code, you also have a function `ymd` that is not loaded. – IRTFM Aug 01 '13 at 15:18
  • @DWin, Thanks, the gist file has been updated – PAC Aug 01 '13 at 15:20

2 Answers2

2

The 'strucchange' package seems designed to return discontinuous results. You may want to look at packages that are designed the way you imagine the result to be structured. The 'segmented' package is one such.

require(segmented)
out.lm<-lm(y~date3,data=df)
o<-segmented(out.lm, seg.Z= ~date3, psi=list(date3=c(-10)),
     control=seg.control(display=FALSE))
 slope(o)
#$date3
#          Est. St.Err. t value CI(95%).l CI(95%).u
#slope1  0.3964  0.1802   2.199   0.03531    0.7574
#slope2 -1.6970  0.1802  -9.418  -2.05800   -1.3360

str(fitted(o))
# Named num [1:60] 1.94 2.34 2.74 3.13 3.53 ...
# - attr(*, "names")= chr [1:60] "1" "2" "3" "4" ...
plot(y ~ date3, data=df)
lines(fitted(o) ~ date3, data=df)

enter image description here

IRTFM
  • 258,963
  • 21
  • 364
  • 487
0

A continuous piecewise linear fit is also called a linear spline, and can be fit with bs in the splines package (comes with base R).

lm(y ~ bs(x, deg=1, df, knots), ...)

The breakpoints are called knots, and you have to specify them via either the knots argument or the df argument (which chooses knots based on the quantiles of x).

You can also do it manually; linear splines are particularly simple to code up.

lm(y ~ x + pmax(0, x - k1) + pmax(0, x - k2), ...)

where k1 and k2 are the knots. Add more to taste.

Hong Ooi
  • 56,353
  • 13
  • 134
  • 187
  • To my knowledge, `bs()` doesn't find the knots which minimize an information criterion but take arbitrary knots such as quantiles of x. It doesn't tell me where exactly is the breakpoint. – PAC Aug 01 '13 at 15:05
  • Correct. But you didn't specify this in your question, did you? Might be an interesting exercise to write code to choose knots to minimise such a criterion.... – Hong Ooi Aug 01 '13 at 15:06
  • You could probably just use the output from `breakpoints` to set the knot locations. You can see from your graph that it's actually detected the breakpoint quite accurately. – Hong Ooi Aug 01 '13 at 15:17
  • It's unclear to me if the knots would really be the same in a more complex dataset. I don't see any reason why it would be the case. – PAC Aug 01 '13 at 15:31