I don't know of a good built-in way to do this, and as Ben Bolker and others noted, this is not a straightforward question to answer in a robust, generalizable way. That said, I had some success with this specific question using a brute-force approach. Since I'm more comfortable with tidyverse syntax, I used that, but I'm certain this could be done in a similar fashion in base R.
First, I created a grid of ranges to explore, based on the starting x and the length of the sequence. Adjust the granularity depending on how much computation you want to do. For a quick approach I used every fifth x as a starting point and lengths that are multiples of 5. That gave me 1,830 ranges of x, to which I appended the associated y's. Then I nested the x and y into a new column, data.
# From OP
p = (-50:50)^2
y = c(p, 2500 + 10*(1:99), p + 1000)

library(tidyverse); library(broom)
df1 <- data.frame(x = seq_along(y), y = y + 100*rnorm(length(y)))
df1_ranges = crossing(start = seq.int(1, max(df1$x), by = 5),
                      length = seq.int(5, 300, by = 5)) %>%
  mutate(end = start + length - 1) %>%
  filter(end <= max(df1$x)) %>%     # only keep ranges within the data
  uncount(length, .id = "x") %>%    # for each range, put in "length" many rows, numbered 1..length in x
  mutate(x = start + x - 1) %>%     # update x to run from "start" to "end"
  left_join(df1, by = "x") %>%      # attach the y value for each x
  nest(data = c(x, y))              # one row per range, with its points in "data"
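The uncount() step is the least obvious part of that pipeline, so here is a minimal sketch of what it does on a toy grid. The two ranges and the name toy are made up for illustration and are not part of the original data.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy grid: two ranges, one starting at 1 (3 long), one at 6 (2 long)
toy <- tibble(start = c(1, 6), length = c(3, 2)) %>%
  uncount(length, .id = "x") %>%  # repeat each row "length" times; x numbers the copies
  mutate(x = start + x - 1)       # shift x so it runs from start to start + length - 1

toy
```

Each range ends up with one row per x value it covers, and the length column is dropped because uncount() removes the weights column by default. That is why the later select(start:end, ...) only picks up start and end.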
Now I can run lm regressions on each of those ranges. This takes about 9 seconds on my computer. You could speed it up by looking at fewer distinct ranges, or by being cleverer about the search space.
df1_regressions <- df1_ranges %>%
  mutate(fit = map(data, ~lm(y ~ x, data = .x)),  # run lm's
         glance = map(fit, glance),               # summary of fit
         tidied = map(fit, tidy))                 # extract coefficients
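For anyone who prefers base R, the same brute-force search could be sketched roughly as follows. This is my own illustration, not a tested drop-in replacement: the set.seed() call and the names grid and best are assumptions I added, and it records only the standard error of the slope for each range instead of keeping the full fits.

```r
set.seed(1)  # assumption: fixed seed so the sketch is reproducible

# Data as in the question
p <- (-50:50)^2
y <- c(p, 2500 + 10 * (1:99), p + 1000)
df1 <- data.frame(x = seq_along(y), y = y + 100 * rnorm(length(y)))

# Same grid of candidate ranges: every fifth start, lengths in multiples of 5
grid <- expand.grid(start = seq.int(1, max(df1$x), by = 5),
                    length = seq.int(5, 300, by = 5))
grid$end <- grid$start + grid$length - 1
grid <- grid[grid$end <= max(df1$x), ]  # only keep ranges within the data

# Fit lm on each range and keep the standard error of the slope.
# df1[s:e, ] works here because x is simply the row index.
grid$std.error <- mapply(function(s, e) {
  fit <- lm(y ~ x, data = df1[s:e, ])
  summary(fit)$coefficients["x", "Std. Error"]
}, grid$start, grid$end)

best <- grid[which.min(grid$std.error), ]
best
```

With a different seed the exact winning range can shift by a step or two of the grid, since neighbouring ranges have very similar standard errors.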
Skipping to the chase, for this example the regions with the best linear fit have the lowest standard error on the slope (the x term of the regression). Sure enough, this identifies the right spot, ranging from about 100 to 200.
df1_tidied <- df1_regressions %>%
  select(start:end, tidied) %>%
  unnest(tidied) %>%
  filter(term == "x")

df1_tidied %>%
  ggplot(aes(x = start, y = end-start, fill = 1/std.error)) +
  geom_tile() +
  geom_text(data = . %>% filter(std.error == min(std.error)) %>%
              mutate(text = glue::glue("({start}, {end-start})")),
            aes(label = text), color = "white", vjust = -0.5) +
  scale_fill_viridis_c(direction = -1, option = "C")
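As a rough intuition for why the standard error of the slope works as a criterion here: for a simple regression it equals the residual standard error divided by sqrt(sum((x - mean(x))^2)), so it shrinks both when the points sit close to a line and when the range is long. The sketch below checks that formula against lm on a made-up linear stretch; the seed and the simulated data are my assumptions, chosen to resemble the middle section of the question's data.

```r
set.seed(42)  # assumption: arbitrary seed, only for reproducibility

# Hypothetical linear stretch, similar to the middle section of the OP's data
x <- 1:100
y <- 2500 + 10 * x + 100 * rnorm(length(x))
fit <- lm(y ~ x)

# Residual standard error: sqrt(RSS / (n - 2))
sigma_hat <- sqrt(sum(residuals(fit)^2) / (length(x) - 2))

# Standard error of the slope, computed by hand and as reported by lm
se_manual <- sigma_hat / sqrt(sum((x - mean(x))^2))
se_lm <- summary(fit)$coefficients["x", "Std. Error"]

all.equal(se_manual, se_lm)
```

This also hints at a limitation: because longer ranges reduce the denominator's effect on their own, the criterion is biased toward long windows, which happens to suit this example but may not generalize.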

Whew! Now that that's out of the way, we could do what you originally asked and see the fitted regression just for that section.
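df1_tidied %>%
  slice_min(std.error) %>%                          # best range: smallest slope std. error
  select(start, end) %>%
  left_join(df1_ranges, by = c("start", "end")) %>% # bring back the nested points
  mutate(fit = map(data, ~lm(y ~ x, data = .x)),
         augment = map(fit, augment)) %>%
  unnest(augment) -> df1_fitted

ggplot(df1, aes(x, y)) +
  geom_point() +
  geom_line(data = df1_fitted, aes(y = .fitted), color = "red",
            linewidth = 2)  # "linewidth" replaces "size" for lines in ggplot2 >= 3.4.0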
