I have a data frame df which I split into segments of 10 rows, and I fit a quadratic regression to each segment. I want to draw every fitted quadratic on the original plot p1 using stat_function. I tried p1<-p1 + stat_function(fun=f, color='blue') inside a loop. The code below runs without error, but the resulting plot shows only one blue quadratic. How can I achieve the desired plot?

library(ggplot2)

g <- function(x){x*(x-1)*(x-2)}

x<-rnorm(500)
error <- rnorm(500)
y<-g(x)+error
df<-data.frame(x=x,y=y)

p1<-ggplot(df, aes(x=x, y=y)) +
  geom_point(color='red')
p1

for (i in 0:10){
  df2<-df[(1+10*i):(10+10*i),]
  m<-lm(y~poly(x,2,raw=TRUE),df2)
  b<-m$coefficients
  f<-function (x) {b[1]+b[2]*x+b[3]*x^2}
  p1<-p1 + stat_function(fun=f, color='blue')
}
p1
Geoff
  • You can add a list to a ggplot object. This might be helpful: https://stackoverflow.com/a/55627647/8583393 – markus Apr 07 '20 at 14:49

2 Answers

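Your loop does add eleven stat_function layers, but they all draw the same curve. Each f only looks up b when the plot is rendered, and by then b holds the coefficients from the last iteration, so the eleven quadratics lie on top of one another and look like a single line. Note also that stat_function draws f over the x-range of the plot, not over the ten rows used for the fit.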
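If you want to keep stat_function, one option is to bind each segment's coefficients to its own function and add the layers as a list. This is only a sketch under assumptions: the data is rebuilt as in the question, the seed is added for reproducibility, and make_quadratic, fits, funs and layers are names made up here for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, not in the original code
g <- function(x) x * (x - 1) * (x - 2)
x <- rnorm(500)
df <- data.frame(x = x, y = g(x) + rnorm(500))

# Hypothetical helper: returns a quadratic that keeps its own copy of b.
make_quadratic <- function(b) {
  force(b)  # evaluate b now so later changes elsewhere cannot affect it
  function(x) b[1] + b[2] * x + b[3] * x^2
}

# Fit one quadratic per 10-row segment and keep the coefficients.
fits <- lapply(0:10, function(i) {
  seg <- df[(1 + 10 * i):(10 + 10 * i), ]
  unname(coef(lm(y ~ poly(x, 2, raw = TRUE), data = seg)))
})

funs <- lapply(fits, make_quadratic)
layers <- lapply(funs, function(f) stat_function(fun = f, color = "blue"))

# A list of layers can be added to a ggplot object in one step.
p1 <- ggplot(df, aes(x = x, y = y)) +
  geom_point(color = "red") +
  layers
p1
```

Each function returned by make_quadratic has its own enclosing environment, so the curves no longer share a single b.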

I'm not sure what plot you intend, but from the code I can see you are fitting a quadratic to each small segment of the data and drawing it on the original plot. This is equivalent to grouping the data and calling geom_smooth; if you want each curve to extend over the whole plot, use the fullrange=TRUE option.

Maybe try something like this:

p1<-ggplot(df, aes(x=x, y=y)) +
  geom_point(color='red')
p1

line_data = df[1:110,]
line_data$group = (1:nrow(line_data) -1) %/% 10

p1 + geom_smooth(data=line_data, aes(group=group),
                 method="lm", formula=y~poly(x,2,raw=TRUE),
                 fullrange=TRUE, se=FALSE, size=0.2, alpha=0.3)

StupidWolf

Create a new data frame to plot the curve.

library(ggplot2)

g <- function(x){x*(x-1)*(x-2)}

x<-rnorm(500)
error <- rnorm(500)
y<-g(x)+error
df<-data.frame(x=x,y=y)

p1<-ggplot(df, aes(x=x, y=y)) +
  geom_point(color='red')
p1

df3<- data.frame('x' = seq(min(df$x), max(df$x), 0.01))

for (i in 0:10){
  df2<-df[(1+10*i):(10+10*i),]
  m<-lm(y~poly(x,2,raw=TRUE),df2)
  b<-m$coefficients
  f<- function (x) {b[1]+b[2]*x+b[3]*x^2}
  df3$predict<- f(df3$x)
  p1<- p1 + geom_line(data = df3, aes(x = x, y = predict), col = 'blue')
}
p1

Edited to draw each curve across the entire x-axis.
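A variation on the same idea, offered as a sketch rather than a drop-in replacement: collect the predictions for all segments in one long data frame and draw them with a single geom_line, grouped by segment. The names grid, curves and segment are introduced here for illustration, and the seed is an added assumption.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, not in the original code
g <- function(x) x * (x - 1) * (x - 2)
x <- rnorm(500)
df <- data.frame(x = x, y = g(x) + rnorm(500))

# x values at which every fitted quadratic is evaluated
grid <- seq(min(df$x), max(df$x), by = 0.01)

# One block of predictions per segment, tagged with the segment index.
curves <- do.call(rbind, lapply(0:10, function(i) {
  seg <- df[(1 + 10 * i):(10 + 10 * i), ]
  m <- lm(y ~ poly(x, 2, raw = TRUE), data = seg)
  data.frame(
    x = grid,
    predict = unname(predict(m, newdata = data.frame(x = grid))),
    segment = i
  )
}))

p1 <- ggplot(df, aes(x = x, y = y)) +
  geom_point(color = "red") +
  geom_line(data = curves, aes(y = predict, group = segment), color = "blue")
p1
```

With the predictions stored as data, the plot needs only one extra layer, and the segment column could also be mapped to color if the curves need to be told apart.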

Kozolovska