2

I want to fit a piecewise linear function to my data:

  • unknown breakpoint
  • first line has intercept 0
  • second line has slope 0 (unknown what the 'cap' value is though)

Following this answer I can obtain the below fit where the second line has slope=0, but how do I also constrain the intercept of the first line to be 0?

  • y = 4.1593 * x + -1.3469 for x < 9.215
  • y = 36.98 for x > 9.25
set.seed(1)
x <- rnorm(100, mean=9, sd=1.5)
y <- pmin(x * 4, 37) + rnorm(100)
plot(x, y)

enter image description here

# refine to put in a breakpoint allowing the other part to have non-zero slope
# with this setup, segmented will constrain the slope of the first segment to be 0
# since I want the second segment to be 0 I mirror around the X axis
negx <- -x
mpw <- segmented(m0, seg.Z=~negx)
# Estimated Break-Point(s):
# psi1.negx  
#    -9.215

slope(mpw)
# Est. St.Err. t value CI(95%).l CI(95%).u
# slope1  0.0000      NA      NA        NA        NA
# slope2 -4.1593 0.16289 -25.534   -4.4826    -3.836

intercept(mpw)
# intercept1 36.9800
# intercept2 -1.3469

I have tried:

  • using seg.Z ~ negx - 1 in the segmented call: this returns a model that makes no sense to me (slope of -17 when transformed back to the original coordinate system, ie wrong sign entirely)
  • fitting the no-intercept part first then attempting to constrain the second part to be constant: m1 <- lm(steam ~ x - 1) and segmented(m1, seg.Z=~1, psi=1), but it complains that the names of psi don't match the variables in the model (I do not know how to name the intercept "variable")
mathematical.coffee
  • 55,977
  • 11
  • 154
  • 194
  • I extracted data from the scatterplot for analysis, and although I did not use R I have the following values for comparison: fitted breakpoint = 8.95083201, first line slope = 3.98713697, second line offset = 37.01122806 with an overall RMSE = 0.9728 and R-squared = 0.9229 for the model as a whole. These "extracted data" results should be very close to the results obtained from using the original data. – James Phillips Nov 13 '19 at 03:22
  • @JamesPhillips I am not sure that your comment is helpful. No need to "extract data from the scatterplot" - I have provided code to reproduce the problem data exactly. My question is on how to fit such a model in R. The fact that your results are close to what I used to generate the data is great, but not helpful (the answers are breakpoint=9.25, slope1=4, cap=37 as can be read from the code). – mathematical.coffee Nov 14 '19 at 00:08
  • Your final code should yield very similar fit statistics to the RMSE and R-squared values in my comment, which are based on the fitted parameters that I gave. Indeed if I used R I would not need to extract data from the scatterplot - that is correct. – James Phillips Nov 14 '19 at 02:25

0 Answers0