I am creating a deck of scatterplots, each containing a geom_abline (to display the line of equivalence) and a series of points that can fall directly on that line or some distance away from it. I need to calculate the distance between each point and the line drawn by geom_abline. I'm looking for suggestions or methods for how to go about this.

Here's a picture of one of the plots that I am producing:

[image: scatterplot of infections_a vs. infections_b with the line of equivalence]

Here's the code I've used to get it:

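library(data.table)
library(ggplot2)

# One row per unique region, kept as a one-column matrix
region_matrix <- unique(combined_versions_loc[, c("region_name")])
n <- nrow(region_matrix)
region_matrix <- as.matrix(region_matrix)

pdfname <- "filepath"
pdf(file = pdfname)
# One page per region; print() is needed to draw ggplot objects inside a loop
for (i in seq_len(n)) {
  print(
    ggplot(combined_versions_loc[region_name == region_matrix[i, 1]],
           aes(x = infections_a, y = infections_b)) +
      geom_point() +
      geom_abline()  # defaults to slope = 1, intercept = 0 (the line y = x)
  )
}
dev.off()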

region_matrix contains a single column holding the unique values of region_name. Here's a sample of the regions it contains:

region_name
East Asia
South Asia
Central America

The combined_versions_loc data table contains data that resembles the following:

location_id    region_name         infections_a   infections_b
123            South Asia          1.606049e+06   1.606049e+06
141            South Asia          6.794563e+06   6.698043e+06 
542            South Asia          3.794261e+05   3.782946e+05
689            East Asia           2.795561e+06   2.698823e+06
490            East Asia           1.794563e+04   1.295043e+04
246            East Asia           4.794563e+04   4.668021e+04
708            Central America     7.743551e+02   7.688142e+02
326            Central America     2.994362e+06   2.998143e+06
267            Central America     1.794261e+05   1.694031e+05
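
For anyone who wants to run the plotting code, the sample rows above can be rebuilt as a data.table. This is only a sketch: it assumes the data.table package is installed and uses just the nine rows shown, whereas the real table presumably has more.

```r
library(data.table)

# Sample rows copied from the table above; the full data set is not shown in the post
combined_versions_loc <- data.table(
  location_id  = c(123, 141, 542, 689, 490, 246, 708, 326, 267),
  region_name  = rep(c("South Asia", "East Asia", "Central America"), each = 3),
  infections_a = c(1.606049e+06, 6.794563e+06, 3.794261e+05,
                   2.795561e+06, 1.794563e+04, 4.794563e+04,
                   7.743551e+02, 2.994362e+06, 1.794261e+05),
  infections_b = c(1.606049e+06, 6.698043e+06, 3.782946e+05,
                   2.698823e+06, 1.295043e+04, 4.668021e+04,
                   7.688142e+02, 2.998143e+06, 1.694031e+05)
)

# The same row filter that the plotting loop applies for each region
east_asia <- combined_versions_loc[region_name == "East Asia"]
print(east_asia)
```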

Each scatterplot shows one point (infections_a, infections_b) per location_id within a single region_name, so each page covers one region (one for East Asia, one for Central America, and so on).
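
One possible way to get the distances, sketched here under some assumptions: geom_abline() with no arguments draws the line y = x (slope 1, intercept 0), so the distance can be computed directly from the two columns without extracting anything from the plot. The helper name dist_to_abline is hypothetical, and whether the vertical residual or the perpendicular distance is the one wanted is also an assumption, so the sketch returns both.

```r
# Signed distances of points (x, y) from the line y = slope * x + intercept.
# dist_to_abline is a hypothetical helper name, not part of ggplot2.
dist_to_abline <- function(x, y, slope = 1, intercept = 0) {
  # Vertical residual: how far the point lies above (+) or below (-) the line
  vertical <- y - (slope * x + intercept)
  # Shortest (orthogonal) distance to the line, with the same sign convention
  perpendicular <- vertical / sqrt(1 + slope^2)
  data.frame(vertical = vertical, perpendicular = perpendicular)
}

# East Asia rows copied from the sample data above
east_asia <- data.frame(
  location_id  = c(689, 490, 246),
  infections_a = c(2.795561e+06, 1.794563e+04, 4.794563e+04),
  infections_b = c(2.698823e+06, 1.295043e+04, 4.668021e+04)
)

# One pair of distances per plotted point
east_asia <- cbind(
  east_asia,
  dist_to_abline(east_asia$infections_a, east_asia$infections_b)
)
print(east_asia)
```

Points exactly on the line get a distance of 0; use abs() if only the magnitude matters. The perpendicular distance is in data units, so it matches what is seen on the page only when both axes are drawn to the same scale (for example with coord_fixed()).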

Konrad Rudolph
bziggy
  • Are you asking about how to retrieve residuals from a ggplot2 stat_smooth? Or rather how to recalc them (i.e. using `lm`)? – dario Oct 18 '21 at 16:46
  • @dario I believe the residuals from stat_smooth would do the trick – bziggy Oct 18 '21 at 16:49
  • Does this answer your question? [Method to extract stat\_smooth line fit](https://stackoverflow.com/questions/9789871/method-to-extract-stat-smooth-line-fit) – dario Oct 18 '21 at 16:52
  • You could improve your chances of finding help here by adding a [minimal reproducible example](https://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example/5963610#5963610). Adding an MRE and an example of the desired output (in code form, not tables and pictures) makes it much easier for others to find and test an answer to your question. That way you can help others to help you! P.S. Here is [a good overview on how to ask a good question](https://stackoverflow.com/help/how-to-ask) – dario Oct 18 '21 at 16:54
  • @dario this is helpful, but the example you have posted returns 80 points for each plot. I need to calculate the residual for each point that I've plotted. – bziggy Oct 18 '21 at 19:55

0 Answers