0

from this question and answer, I'm trying to make a picewise linear regression with unknown breakpoint.

The difference, is that I need so constrain the slope on either side of the breakpoint, which I'm not able to do yet.

my data looks like this. I would like an function, so I can map it over multiple nested datasets.

dput(head(my_data, 30 ))
structure(list(vo2 = c(1.967, 3.113, 2.881, 2.931, 2.809, 2.802, 
2.937, 3.235, 3.238, 3.118, 3.177, 2.959, 2.741, 3.157, 2.975, 
2.986, 3.231, 2.448, 2.966, 2.834, 3.559, 3.37, 3.187, 3.079, 
3.076, 2.848, 3.16, 3.285, 3.159, 3.305), vco2 = c(1.552, 2.458, 
2.303, 2.372, 2.264, 2.284, 2.352, 2.566, 2.585, 2.506, 2.6, 
2.441, 2.251, 2.592, 2.418, 2.428, 2.665, 2.039, 2.437, 2.298, 
2.891, 2.733, 2.609, 2.514, 2.538, 2.286, 2.497, 2.59, 2.489, 
2.606)), row.names = c(NA, -30L), class = c("tbl_df", "tbl", 
"data.frame"))

The answer I'm working from is here https://stackoverflow.com/a/15877616/9368078

the code is here

function (x,y) 
{
    f <- function (Cx) 
        {
        lhs <- function(x) ifelse(x < Cx,Cx-x,0)
        rhs <- function(x) ifelse(x < Cx,0,x-Cx)
        fit <- lm(y ~ lhs(x) + rhs(x))
        c(summary(fit)$r.squared, 
            summary(fit)$coef[1], summary(fit)$coef[2],
            summary(fit)$coef[3])
        }

    r2 <- function(x) -(f(x)[1])

    res <- optimize(r2,interval=c(min(x),max(x)))
    res <- c(res$minimum,f(res$minimum))

    best_Cx <- res[1]
    coef1 <- res[3]
    coef2 <- res[4]
    coef3 <- res[5]
    plot(x,y)
    abline(coef1+best_Cx*coef2,-coef2) #lhs  
    abline(coef1-best_Cx*coef3,coef3)  #rs

inspiration solution

I've tried the following alternatives, so I can constrain the slope. Here I try to set boundaries on the coef in optimize.

f <- function (Cx) {
  lhs <- function(x) ifelse(x < Cx,Cx-x,0)
  rhs <- function(x) ifelse(x < Cx,0,x-Cx)
  fit <- lm(y ~ lhs(x) + rhs(x))
  c(summary(fit)$r.squared, 
    summary(fit)$coef[1], summary(fit)$coef[2],
    summary(fit)$coef[3])
}
r2 <- function(x) -(f(x)[1])

res <- optimize(r2,interval=c(min(x),max(x), lower = c(-100,-100,1), upper = c(100,1,100)) )
res <- c(res$minimum,f(res$minimum))

best_Cx <- res[1]
coef1 <- res[3]
coef2 <- res[4]
coef3 <- res[5]
plot(x,y)
abline(coef1+best_Cx*coef2,-coef2) #lhs  
abline(coef1-best_Cx*coef3,coef3)  #rs

Here I try an nls solution. Where I try to set boundaries on the variables.

f <- function (Cx) {
  lhs <- function(x) ifelse(x < Cx,Cx-x,0)
  rhs <- function(x) ifelse(x < Cx,0,x-Cx)
  fit <- nls(y ~ lhs(x) + rhs(x), start = c(0,1.5), 
             algorithm = "port", 
             lower = c(-inf,-inf,1), 
             upper = c(inf,1,inf) )
  c(summary(fit)$r.squared, 
    summary(fit)$coef[1], summary(fit)$coef[2],
    summary(fit)$coef[3])
}
r2 <- function(x) -(f(x)[1])

res <- optimize(r2,interval=c(min(x),max(x)) )
res <- c(res$minimum,f(res$minimum))

best_Cx <- res[1]
coef1 <- res[3]
coef2 <- res[4]
coef3 <- res[5]

But neither solution works for me. So I'm definetly doing something wrong.

I'm into sports science, and trying to learn a bit of statistics.

Can someone point my in a direction regarding a solution?

Jason Aller
  • 3,541
  • 28
  • 38
  • 38
olesendan
  • 15
  • 6
  • Check out CRAN package [segmented](https://cran.r-project.org/web/packages/segmented/index.html). It might save you time to use tested functions than to try to write your own. – Rui Barradas Nov 12 '18 at 12:31
  • @RuiBarradas, thanks for your suggestion. I've been messing around with segmented. But I can not figure out, how to constain the slope in segmented. Do you know if it is possible? – olesendan Nov 12 '18 at 13:38
  • I don't believe so, what you can is to pass an initial estimate for the breakpoint. – Rui Barradas Nov 12 '18 at 14:50

0 Answers0