I am new to R and am having trouble visualizing data with the ggplot2 package. I would like to create a linear regression plot in which the points inside a specific area share one color and the points outside that area share another color. I would also like to change the background inside that area to draw attention to it.
The graph I would like to make is similar to the one below.
So far, I have only been able to create the simple graph below.
The code that generates the current graph is below.
library(ggplot2)
g <- ggplot(df, aes(x = real, y = predicted))
g + geom_point() +
  geom_abline(intercept = 0, slope = 1, color = "black") +
  theme_classic() +
  geom_abline(intercept = s_est, slope = 1, color = "darkgrey") +
  geom_abline(intercept = -s_est, slope = 1, color = "darkgrey") +
  ggtitle("Test Set")
The first 100 rows of the data are as follows.
structure(list(real = c(3.33, 5.92, 5.3, 6, 6.96, 7.03, 6.6,
7.92, 8.3, 10.52, 6.34, 4.38, 4.59, 9.8, 10.3, 10, 8.25, 6, 7.44,
6.66, 9.09, 9.22, 9.7, 4.82, 6.1, 4.92, 4.29, 3.22, 6.01, 9.05,
9.04, 4.85, 8.22, 6.7, 6.7, 4.62, 4.82, 8.52, 5.24, 8.15, 7,
10, 7, 5.18, 5.93, 8.4, 7.7, 7.24, 9.54, 6.06, 8, 4.35, 4.2,
4.51, 2.48, 9.1, 5.34, 4.19, 8.05, 8.55, 6.55, 11.4, 10.96, 9.64,
4.49, 6, 6.9, 6.17, 9, 6.92, 3.77, 4.22, 8.92, 7.55, 7.6, 6.82,
5.32, 8.39, 5.09, 10.96, 6.68, 9.4, 5.04, 5.59, 9.21, 9.7, 6.98,
6.17, 8.89, 9.74, 6.08, 6.7, 4.41, 3.57, 7.12, 6.09, 6.11, 6.82,
7.3, 6.77), predicted = c(3.3049898147583, 7.57794666290283,
5.81329345703125, 3.71067190170288, 6.35026741027832, 6.59200620651245,
6.32752990722656, 7.13449430465698, 7.78791570663452, 8.61589622497559,
7.72269868850708, 5.33322525024414, 7.26069974899292, 9.23727989196777,
8.27904891967773, 7.55226612091064, 5.94742393493652, 4.07633399963379,
7.67468595504761, 5.64575576782227, 7.85368394851685, 7.73117685317993,
10.2843132019043, 4.96891403198242, 6.29262351989746, 6.03091764450073,
6.71697568893433, 3.50744342803955, 6.46608829498291, 8.20327758789062,
7.52885150909424, 4.58155632019043, 6.1530909538269, 6.49482202529907,
5.28225088119507, 4.44094896316528, 5.503089427948, 7.79408073425293,
5.6220269203186, 7.12402009963989, 6.30716276168823, 7.15596580505371,
7.26271867752075, 5.41359615325928, 5.68268489837646, 6.81329536437988,
7.10254955291748, 8.64251136779785, 8.65674114227295, 5.94885206222534,
9.24687099456787, 5.93400239944458, 5.66134691238403, 6.14793062210083,
2.94440221786499, 9.21078777313232, 5.96825170516968, 4.69157028198242,
7.91313886642456, 6.90836668014526, 6.72082805633545, 9.95611953735352,
9.15732383728027, 6.68948268890381, 3.60811305046082, 7.42742109298706,
6.05647945404053, 6.2350025177002, 8.12950134277344, 7.56590843200684,
5.3975772857666, 3.48417925834656, 7.63604927062988, 8.04048824310303,
7.78053188323975, 7.34217929840088, 7.93345308303833, 8.03125,
5.62498426437378, 4.80621385574341, 5.19631958007812, 7.51661252975464,
5.43919944763184, 5.5195426940918, 6.10152912139893, 8.25357818603516,
5.73111486434937, 7.27180528640747, 8.37008285522461, 7.78157567977905,
7.52273559570312, 4.32158374786377, 6.20211696624756, 4.30103015899658,
7.89811611175537, 6.88143062591553, 6.74230575561523, 6.75651741027832,
6.64747190475464, 6.72232007980347)), class = c("tbl_df", "tbl",
"data.frame"), row.names = c(NA, -100L))
s_est = 4.536
Thank you so much for any help.
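For reference, here is a minimal sketch of one way the plot described above might be built. It assumes the "specific area" is the band between the two grey lines, that is, the points where `abs(predicted - real) <= s_est`. The column `within_band`, the helper data frame `band`, the colors, and the small stand-in data with `s_est <- 1` are all illustrative choices, not part of the original code.

```r
library(ggplot2)

# Hypothetical stand-in data (first five rows, rounded); swap in the real `df`.
df <- data.frame(
  real      = c(3.33, 5.92, 5.30, 6.00, 6.96),
  predicted = c(3.30, 7.58, 5.81, 3.71, 6.35)
)
s_est <- 1  # illustrative value only, chosen so that some points fall outside

# Flag points lying between the lines y = x - s_est and y = x + s_est.
df$within_band <- abs(df$predicted - df$real) <= s_est

# Corners of the parallelogram between the two offset lines,
# spanning the range of the data.
lo <- min(df$real, df$predicted)
hi <- max(df$real, df$predicted)
band <- data.frame(
  x = c(lo, hi, hi, lo),
  y = c(lo - s_est, hi - s_est, hi + s_est, lo + s_est)
)

p <- ggplot(df, aes(x = real, y = predicted)) +
  # Shaded background for the band, drawn first so points sit on top.
  geom_polygon(data = band, aes(x = x, y = y),
               inherit.aes = FALSE, fill = "grey90") +
  geom_point(aes(color = within_band)) +
  geom_abline(intercept = 0, slope = 1, color = "black") +
  geom_abline(intercept = s_est, slope = 1, color = "darkgrey") +
  geom_abline(intercept = -s_est, slope = 1, color = "darkgrey") +
  scale_color_manual(values = c("TRUE" = "steelblue", "FALSE" = "firebrick"),
                     name = "Within band") +
  # Zoom to the data range without dropping the polygon corners.
  coord_cartesian(xlim = c(lo, hi), ylim = c(lo, hi)) +
  theme_classic() +
  ggtitle("Test Set")

print(p)
```

With the full data and `s_est = 4.536`, every point appears to fall inside the band, so only one color would show. The polygon is drawn before the points so the shading stays in the background, and `coord_cartesian()` is used instead of `xlim()`/`ylim()` so the polygon corners outside the visible range are clipped rather than removed.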