I am working with the R programming language.
I generated some random data and added a polynomial regression line to the data:
# PLOT 1
library(ggplot2)
x = rnorm(15, 2,2)
y = rnorm(15,7,2)
df = data.frame(x,y)
p <- ggplot(df, aes(x, y))
p <- p + geom_point(alpha = 2/10, shape = 21, fill = "blue", colour = "black", size = 5)
# Add a degree-6 polynomial regression line (fitted with lm, not loess)
p + stat_smooth(method = "lm", se = TRUE, fill = NA,
                formula = y ~ poly(x, 6, raw = TRUE), colour = "red") +
  ggtitle("Original Data: Polynomial Regression Model")
Now I want to add a single outlier to this data, refit the polynomial regression, and plot the result:
# PLOT 2
x = rnorm(1,13,1)
y = rnorm(1, 13,1)
df_1 = data.frame(x,y)
df = rbind(df, df_1)
p <- ggplot(df, aes(x, y))
p <- p + geom_point(alpha = 2/10, shape = 21, fill = "blue", colour = "black", size = 5)
# Add a degree-6 polynomial regression line (fitted with lm, not loess)
p + stat_smooth(method = "lm", se = TRUE, fill = NA,
                formula = y ~ poly(x, 6, raw = TRUE), colour = "red") +
  ggtitle("Modified Data: Polynomial Regression Model")
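My Problem: The y-axis range is now so large that the data points look like a "flat line". Presumably the fitted curve swings far outside the range of the observed y values between the main cluster and the outlier, and the axis stretches to cover it.
I tried to fix this by limiting the y-axis to the range of the observed data: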
p + stat_smooth(method = "lm", se = TRUE, fill = NA,
                formula = y ~ poly(x, 6, raw = TRUE), colour = "red") +
  ggtitle("Modified Data: Polynomial Regression Model") +
  scale_y_continuous(limits = c(min(df$y), max(df$y)))
But I now get the following warning message:
Warning message:
Removed 35 rows containing missing values (geom_smooth).
My Question: Why are rows being deleted when I try to fix the axis? Is there a better way to correct this problem?
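From what I have read in the ggplot2 documentation, limits set on a scale replace out-of-range values with NA, whereas limits set through coord_cartesian() only zoom the view and leave the underlying data untouched. If that is right, the removed rows would be points of the fitted curve that fall outside the limits, not my original observations. Below is a minimal sketch of the coord_cartesian() alternative. The seed and the name p_zoom are my own assumptions for illustration, and I am not sure this is the recommended approach:

```r
library(ggplot2)

set.seed(42)  # assumed seed, only so this sketch is reproducible

# Same construction as above: 15 random points plus one outlier
df <- data.frame(x = rnorm(15, 2, 2), y = rnorm(15, 7, 2))
df <- rbind(df, data.frame(x = rnorm(1, 13, 1), y = rnorm(1, 13, 1)))

p <- ggplot(df, aes(x, y)) +
  geom_point(alpha = 2/10, shape = 21, fill = "blue", colour = "black", size = 5) +
  stat_smooth(method = "lm", se = TRUE, fill = NA,
              formula = y ~ poly(x, 6, raw = TRUE), colour = "red") +
  ggtitle("Modified Data: Polynomial Regression Model")

# Zoom the view to the observed y range instead of setting scale limits;
# the fitted curve is clipped visually but no values are turned into NA
p_zoom <- p + coord_cartesian(ylim = range(df$y))
print(p_zoom)
```

With this version the curve simply runs off the top or bottom of the panel where it leaves the observed range, which may or may not be what is wanted.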
Thanks!