4

Not to be confused with this post.

And very similar to this post.

This is a very simple concept.

I have a set of points, and I want to calculate the distance of each point normal to a line.

In this repoducible example, I load the sp::meuse data and draw a line. How do I add a column to meuse equal to the distance of each point normal (at a right angle) to the line?

# load data
library(sp)
data(meuse)

# create new line
newline = data.frame(y = seq(from = 330000, to = 334000, length.out = 100), x = seq(from = 178000, to = 181000, length.out = 100))
# plot the data
meuse %>% 
  ggplot(aes(x, y)) + geom_point() + 
  ggtitle("Zinc Concentration (ppm)") + coord_equal() +
  geom_line(data = newline, aes(x,y))

To illustrate:

enter image description here

Rich Pauloo
  • 7,734
  • 4
  • 37
  • 69
  • Possible duplicate of [Using R, how to calculate the distance from one point to a line?](https://stackoverflow.com/questions/35194048/using-r-how-to-calculate-the-distance-from-one-point-to-a-line) – Chris Holbrook Sep 07 '17 at 23:36
  • @ChrisHolbrook, this question is different in 2 ways. I'm looking for distance of a **set** of points, **normal** to a line. – Rich Pauloo Sep 07 '17 at 23:38

2 Answers2

4

As you are using the meuse data, it seems natural to use spatial objects and spatial computations :

library(sf)
library(sp)
data(meuse)

# create new line - please note the small changes relative to your code : 
# x = first column and cbind to have a matrix instead of a data.frame
newline = cbind(x = seq(from = 178000, to = 181000, length.out = 100),
                y = seq(from = 330000, to = 334000, length.out = 100))

# transform the points and lines into spatial objects
meuse <- st_as_sf(meuse, coords = c("x", "y"))
newline <- st_linestring(newline)

# Compute the distance - works also for non straight lines !
st_distance(meuse, newline) [1:10]
## [1] 291.0 285.2 409.8 548.0 647.6 756.0 510.0 403.8 509.4 684.8


# Ploting to check that your line is were you expect it
plot_sf(meuse)
plot(meuse, add = TRUE)
plot(newline, add = TRUE)

You can convince yourself that these are perpendicular distances relative to your straight line by runing the same code on a line with only 2 coordinates.
Note however that this is the minimum distance to the line. So for the points close the tips of the line segment or outside the range of the straight line, you will not get a perpendicular distance (just the shortest distance to the tip).
You must have a line long enough to avoid this...

newline = cbind(x = c(178000, 181000), 
                y = c(330000, 334000))

# transform the points and lines into spatial objects
meuse <- st_as_sf(meuse, coords = c("x", "y"), crs = 31370)
newline <- st_linestring(newline)

# Compute the distance - works also for non straight lines !
st_distance(meuse, newline) [1:10]
## [1] 291.0 285.2 409.8 548.0 647.6 756.0 510.0 403.8 509.4 684.8


# Ploting to check that your line is were you expect it
plot_sf(meuse)
plot(meuse, add = TRUE)
plot(newline, add = TRUE)

If the slope of the line was 1 you could compute the projection of the points on the line using Pytagoras (difference in x = difference in y = distance to the line / sqrt(2)).

This does not work here (the red segments are not perpendicular to the line) because the slope is not 1 and hence the difference in y coordinates is not equal to the difference in x coordinates. Pytagoras equation is not solvable here. (but the distances computed by st_distance are perpendicular to the line.)

distances <- st_distance(meuse, newline)
newline = data.frame(x = c(178000, 181000), 
                     y = c(330000, 334000))

segments <- as.data.frame(st_coordinates(meuse))
segments <- data.frame(
    segments, 
    X2 = segments$X - distances/sqrt(2), 
    Y2 = segments$Y + distances/sqrt(2)
)    

library(ggplot2)
ggplot() +
    geom_point(data = segments, aes(X,Y)) +
    geom_line(data = newline, aes(x,y)) +
    geom_segment(data = segments, aes(x = X, y = Y, xend = X2, yend = Y2), 
                 color = "orangered", alpha = 0.5) +
    coord_equal() + xlim(c(177000, 182000))

enter image description here

If you want just to plot it, you can use rgeos gProject function (no equivalent in sf for the moment) to get the coordinates of the projection of the points on the line. You need the points and lines in ssp format instead of sf format and convert between matrix for sp and data.frame for ggplot.

library(sp)
library(rgeos)

newline = cbind(x = c(178000, 181000), 
                y = c(330000, 334000))

spline <- as(st_as_sfc(st_as_text(st_linestring(newline))), "Spatial") # there is probably a more straighforward solution...
position <- gProject(spline, as(meuse, "Spatial"))
position <-  coordinates(gInterpolate(spline, position))
colnames(position) <- c("X2", "Y2")

segments <- data.frame(st_coordinates(meuse), position)

library(ggplot2)
ggplot() +
    geom_point(data = segments, aes(X,Y)) +
    geom_line(data = as.data.frame(newline), aes(x,y)) +
    geom_segment(data = segments, aes(x = X, y = Y, xend = X2, yend = Y2), 
                 color = "orangered", alpha = 0.5) +
    coord_equal() + xlim(c(177000, 182000))

enter image description here

Gilles San Martin
  • 4,224
  • 1
  • 18
  • 31
  • This seems like a simple, promising and scalable solution. However, this code isn't working for me. I get the error: `> st_distance(meuse, newline) [1:10]` **Error: st_crs(x) == st_crs(y) is not TRUE** – Rich Pauloo Sep 08 '17 at 00:18
  • Sorry, remove the crs argument in `st_as_sf(meuse, coords = c("x", "y"))`. See the edited reply – Gilles San Martin Sep 08 '17 at 00:21
  • I already tried that, and unfortunately it will returns the same error. Could it be that newline isn't a spatial object? That is, str(meuse) shows that meuse is SPDF, and str(newlines) shows it as a numeric vector of 2 columns, with x and y attributes only. Thanks for your help by the way! :) – Rich Pauloo Sep 08 '17 at 00:23
  • I think you have to restart your R session for a fresh start. – Gilles San Martin Sep 08 '17 at 00:27
  • hahaha funny you should say that. I was thinking the same thing, and now I'm getting some distances! I'll take a look. Thanks for troubleshooting some spatial data with me. It's good too, because the example is for a larger set of spatialpoints! – Rich Pauloo Sep 08 '17 at 00:33
  • I want to accept this answer, but before I do, can you confirm that the vector of distances are normal (orthogonal) to the line? I'm working on this visualization too. I think they can be drawn with `lines`..... – Rich Pauloo Sep 08 '17 at 00:36
1

In addition to @Giles answer, I want to share another approach I came up with just in case someone else who has the same problem reads this.


The following 2 functions take a line defined by a sequence of points, and another set of points, and calculates the minimum distance between each of the pts and the line, as discussed above.


1. Pythagorean Theorem: for calculating the Euclidian straight line distance (on a sphere, like Earth, this line goes through the earth). Not so correct, unless this is a 2D plane.

# data for the function
line <- data.frame(x = seq(from = 178000, to = 181000, length.out = nrow(meuse)),
                   y = seq(from = 330000, to = 334000, length.out = nrow(meuse)))
data(meuse)
pts <- meuse[,1:2] # the pts must **only** be a set of x,y coords.

# takes a line defined by a set of points along a line, and a set of points, and
# returns the minimum (orthogonal) Euclidian distance between all pts and
# the line.
euclid_min_d <- function(line, pts){
  d <- vector(mode = "integer", length = nrow(pts))
  for(i in 1:nrow(pts)){
    d[i] = min(( abs(abs(pts$x[i]) - abs(line$x)) + abs(abs(pts$y[i]) - abs(line$y)) )^(.5))
  }
  return(d)
}

# run the function

> euclid_min_d(line, pts)
  [1] 19.100214 18.884131 22.751052 26.279714 28.542290 30.792561 25.281532 22.542312 25.311822 29.261029
 [11] 29.244158 27.581520 19.304464 23.133997 25.341819 19.891916 20.933723 21.650710 22.187513 23.017780
 [21] 24.815684 25.861519 26.623883 31.012356 32.121523 32.921197 30.427047 31.005236 29.177246 33.856458
 [31] 34.725454 31.817866 31.557616 34.042563 34.963060 31.055041 28.352833 25.067182 23.438231 23.029907
 [41] 27.357185 30.330645 28.661028 29.866152 25.177809 26.005244 27.847684 29.808262 31.489227 31.789895
 [51] 31.186702 21.838248 19.593930 16.820674 12.939921  8.453079 16.431282 17.929516 21.105486  8.029976
 [61]  8.775704 13.154921 16.281613 17.222758 17.667933 17.583567 25.114026 31.508193 35.387833 20.621827
 [71] 21.949470 20.537264 21.489955 23.383866 23.437953 24.926646 26.035939 17.687034 18.105391 17.051869
 [81] 18.006853 43.461582 18.294844 26.630712 22.386800 22.206820 19.229644 19.982785 22.110408 23.343288
 [91] 27.182643 17.403911 16.936858 36.972085 34.168034 31.318639 29.846360 26.406414 25.530730 30.197618
[101] 35.733501 37.126320 37.440689 34.361334 33.550698 36.716888 38.238826 39.461143 38.888285 34.671183
[111] 32.966138 28.270263 28.243376 25.291033 19.522880 18.844201 18.683280 45.283953 28.036386 27.277554
[121] 18.934611 13.435271 15.683145 23.364975 23.064562 31.654589 33.422133 32.795906 22.917696 23.211618
[131] 36.513118 27.062218 27.744895 34.663690 30.394377 29.861151 34.564827 26.293548 24.816469 27.611404
[141] 34.684103 37.463055 40.124805 40.472213 38.301436 39.127995 35.930488 31.048349 35.284558 31.208557
[151] 32.370020 29.511389 25.336181 34.386081 49.889358

2. sp::spDist or sp::spDistN1: Haversine (Great Circle) distance over the curved surface of a sphere. More correct for spatial long/lat data, especially points over a large region.

# Haversine (Great Circle distance) for spatial points
library(sp)

# bring in data
data(meuse)

# define a line by a sequence of points between two points at the ends of a line. 
# length can be any number. as length approaches infinity, the distance between 
# points and the line appraoches a normal (orthogonal) orientation

line <- data.frame(x = seq(from = 178000, to = 181000, length.out = nrow(meuse)),
                       y = seq(from = 330000, to = 334000, length.out = nrow(meuse)))

# columns 1 and 2 are long/lat, column 3 is extra data to make the SPDF
pts <- meuse[,1:3] 

# sp::spDists only takes spatial objects, so make line and pts 'spatial objects' 
coordinates(line) <- c("x","y")
coordinates(pts) <- c("x","y")
class(line) # spdf
class(pts) # spdf

# takes a line defined by a set of points along a line, and a set of points, and
# returns the minimum (orthogonal) Great circle distance between all pts and
# the line.

gc_min_d <- function(line, pts){
  d <- vector(mode = "integer", length = nrow(pts))
  for(i in 1:nrow(pts@coords)){
    d[i] = min(spDistsN1(line, pts@coords[i,]))
  }
  return(d)
}

# run the function

> gc_min_d(line, pts)
  [1]  291.11720  285.53973  409.96584  548.04129  647.62204  756.00049  510.23214  403.85051  509.42670
 [10]  684.83496  683.85902  607.07043  296.06328  424.00002  512.42695  316.53043  348.31645  372.00033
 [19]  391.95541  419.80867  489.01900  533.00685  564.94128  767.47913  823.85186  866.80035  740.20012
 [28]  766.21521  681.05998  914.12500  964.42709  809.15362  793.28237  923.85091  974.03591  770.16568
 [37]  638.61108  501.28887  435.72032  420.80271  597.42032  733.09464  655.81036  709.53065  504.62940
 [46]  539.63972  618.02925  709.00052  791.80007  804.61682  776.86267  379.10017  304.56883  225.40137
 [55]  131.15538   54.74886  212.76568  253.89445  352.36923   49.33170   59.28282  134.22712  211.55102
 [64]  234.80732  247.80522  245.84412  501.01377  793.84395 1001.00078  339.25067  382.25984  333.60141
 [73]  369.72499  433.24388  438.60154  494.40929  540.41364  248.70466  262.20469  230.42463  258.05366
 [82] 1509.07023  264.01918  566.03235  396.98188  393.42160  294.08301  317.21856  390.41809  433.23431
 [91]  586.86838  240.47441  226.83961 1090.66783  933.51589  783.80282  710.00866  556.40688  518.09206
[100]  726.00636 1019.61784 1101.62687 1120.26196  942.68793  897.40984 1077.08258 1167.68956 1243.01891
[109] 1209.40946  957.62977  866.73835  635.60151  637.89236  509.05787  303.69985  280.34209  277.20013
[118] 1637.20259  627.43972  591.21355  286.64675  141.47774  194.00723  434.05177  425.10718  801.54641
[127]  889.25788  857.64172  420.13830  430.81025 1064.83266  584.79094  614.02411  957.26008  738.60392
[136]  712.45532  951.40992  549.20351  490.60506  608.26798  961.20035 1121.64783 1276.01133 1271.80112
[145] 1147.71273 1167.60070  979.46392  735.61392  974.03783  774.80205  838.08370  692.84592  513.42862
[154]  944.32835 1987.61385
Rich Pauloo
  • 7,734
  • 4
  • 37
  • 69
  • Didn't you forget to square the difference of coordinates? i.e. I would wrote in the first case `d[i] = min(( abs(abs(pts$x[i]) - abs(line$x))^2 + abs(abs(pts$y[i]) - abs(line$y))^2 )^(.5))` – atsyplenkov Sep 23 '20 at 14:11