I'm looking for the most efficient way to identify and extract the data points that fall outside the shaded confidence band in a scatter plot with a fitted curve, like this one:
library(ggplot2)
ggplot(df, aes(x, y)) +
  geom_point() +
  stat_smooth(method = "lm", formula = y ~ poly(x, 2), size = 1, se = TRUE, level = 0.99)
I would like to add a new variable that flags each data point, with 1 for points inside the band and 0 for points that fall outside it, as follows:
       x     y group
 1:  0.0  0.00     1
 2:  0.5  0.40     1
 3:  0.9  0.70     1
 4:  1.0  1.30     1
 5:  2.0  6.60     0
 6:  3.0  3.10     1
 7:  4.0  4.40     1
 8:  5.0  5.90     1
 9:  6.0  6.05     1
10:  7.0  7.60     1
11:  8.0  8.00     1
12:  9.0  2.90     0
13: 10.0 13.80     1
14: 11.0 13.40     1
15: 12.0 14.90     1
Original Data:
library(data.table)
df <- data.table(x = c(0, 0.5, 0.9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
                 y = c(0, 0.4, 0.7, 1.3, 6.6, 3.1, 4.4, 5.9, 6.05, 7.6, 8, 2.9, 13.8, 13.4, 14.9))
Desired Data:
df2 <- data.table(x = c(0, 0.5, 0.9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
                  y = c(0, 0.4, 0.7, 1.3, 6.6, 3.1, 4.4, 5.9, 6.05, 7.6, 8, 2.9, 13.8, 13.4, 14.9),
                  group = c(1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1))
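For reference, here is a minimal sketch of the kind of flagging I have in mind. It assumes the shaded band is the same as the 99% confidence interval returned by predict() on the equivalent lm() fit. The names fit and band are only illustrative, and the points this actually flags may well differ from my hand-labelled group column above, since that column is just meant to show the desired format.

```r
library(data.table)

df <- data.table(
  x = c(0, 0.5, 0.9, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12),
  y = c(0, 0.4, 0.7, 1.3, 6.6, 3.1, 4.4, 5.9, 6.05, 7.6, 8, 2.9, 13.8, 13.4, 14.9)
)

# Refit the same model that stat_smooth() is asked to draw
fit <- lm(y ~ poly(x, 2), data = df)

# Confidence interval for the fitted mean at each observed x;
# returns a matrix with columns "fit", "lwr" and "upr"
band <- predict(fit, newdata = df, interval = "confidence", level = 0.99)

# group = 1 if the point lies within [lwr, upr], 0 otherwise
df[, group := as.integer(y >= band[, "lwr"] & y <= band[, "upr"])]

# Extract only the points outside the band
outside <- df[group == 0]
```

Ideally there would be a way to get this straight from the plot object instead of refitting the model by hand.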