1

This question was answered here, but not answered in R language. I am a relatively new to coding, so I haven't been able to figure out how to 'translate' the accepted answer's C++ code into R code.

As in the linked question, my line segment is defined by two endpoints: A (x1,y1) and B (x2,y2). I'm trying to find the shortest (i.e., perpendicular) distance between this line segment and a point C (x3,y3). Below is some example code illustrating that point "C" should have a shortest distance to the line segment, but point "C1" should not.

A <- c(2, 4)
B <- c(8, 16)
C <- c(3, 11)
C1<- c(11, 16)

plot(1, type = "n", xlim = c(0, 25), ylim = c(0, 25))
points(C[1], C[2], col = "red")
points(C1[1], C1[2], col = "blue")
points(A[1], A[2])
points(B[1], B[2])
segments(A[1], A[2], B[1], B[2])

Thanks in advance for any help!

I know I have to remove the semi-colons and class types to 'translate ' into R, but I think one of the main problems will be figuring out comparable class types to the vec2 types used in the C++ code.

Passer By
  • 19,325
  • 6
  • 49
  • 96
LGH14
  • 21
  • 2

2 Answers2

1

Are you looking for something like this?

A <- c(2, 4)
B <- c(8, 16)
C <- c(3, 11)
C1<- c(11, 16)

library(sf)
library(sfheaders)

myline <- sf_linestring(matrix(rbind(A,B), ncol = 2))
mypoint <- sf_point(matrix(rbind(C,C1), ncol = 2))

st_length(st_nearest_points(mypoint, myline))
#[1] 0.2425356 7.0000000

visual confirmation

library(ggplot2)
ggplot() + 
  geom_sf(data = myline, color = "green") +
  geom_sf(data = mypoint, color = "blue") +
  geom_sf(data = st_nearest_points(mypoint, myline), color = "magenta")

enter image description here

Wimpel
  • 26,031
  • 1
  • 20
  • 37
  • 2
    I got the impression the question was about how to write this kind of logic in R, as well as how to do the task. But yes this is probably faster and is definitely how in practice I would do this. – SamR Mar 02 '23 at 19:28
  • I was asking both how write this kind of logic in R, as well as how to do the task. If this solution is faster, I will use this one because I am running a simulation with many iterations. – LGH14 Mar 02 '23 at 21:12
  • 1
    @LGH14 I don't know if it's faster - depends what `sf` is doing under the hood - you'd have to benchmark them. – SamR Mar 02 '23 at 21:13
  • 2
    This solution needs to use 'rbind' instead of 'c' to make the coordinates correct. I submitted an edit. – LGH14 Mar 02 '23 at 21:20
  • @LGH14 as you're concerned about speed - I've updated my answer slightly. But if speed is really important you should just use the C++ answer for your simulations - it will be a lot faster. Nice fix with `rbind()` vs `c()` - I've voted to approve your edit. – SamR Mar 02 '23 at 21:39
  • 1
    edit was rejected (don't know why). but I edited the answer. thanks. As far as speed is concerned: I used the approach above on 9spatial) datates of >1M objects, and was not annoyed by the time taken in the analysis. – Wimpel Mar 03 '23 at 07:36
  • How would you use this approach if you had a data frame of points? E.g. if each row was six columns: the x,y of the line and the x,y of the start and end of the segment? I rewrote the JavaScript function in Rust, and complied it with `rextender` to a function you can call from R. It's unsurprisingly faster than the pure R version. I'd like to benchmark against this but unless they're all vectorized the bottleneck will be looping over rows rather than the operations so it won't be a meaningful comparison. – SamR Mar 03 '23 at 08:33
  • Split the two objects, one points, one lines, and then use something like `sf::st_nearest_points()` with `pairwise = TRUE` – Wimpel Mar 04 '23 at 15:24
1

This is mostly a port of the javascript answer, though I've made it a bit more idiomatic for R. In particular, it uses vectors as arguments like the C++ answer (vec2 just means a 2d vector), rather than supplying individual x and y points. To that end, I've defined your points using x and y values:

A <- c(x = 2, y = 4)
B <- c(x = 8, y = 16)
C <- c(x = 3, y = 11)
C1 <- c(x = 11, y = 16)

You can call it like this:

pDistance(start_point = C, segment_p1 = A, segment_p2 = B) 
# Closest point is on segment
# [1] 2.236068
pDistance(start_point = C1, segment_p1 = A, segment_p2 = B) 
# Closest point is: B
# [1] 3

We can plot a circle with a radius of the output from each point to check the answer:

enter image description here

pDistance <- function(start_point, segment_p1, segment_p2) {
    dist_p1 <- start_point - segment_p1
    dist_p2 <- segment_p2 - segment_p1

    args <- as.list(match.call())

    dot <- sum(dist_p1 * dist_p2)
    len_sq <- sum(dist_p2^2)

    param <- -1
    if (len_sq != 0) { # in case of 0 length line
        param <- dot / len_sq
    }

    if (param < 0) {
        message("Closest point is: ", args$segment_p1)
        xx <- segment_p1["x"]
        yy <- segment_p1["y"]
    } else if (param > 1) {
        message("Closest point is: ", args$segment_p2)
        xx <- segment_p2["x"]
        yy <- segment_p2["y"]
    } else {
        message("Closest point is on segment")
        xx <- segment_p1["x"] + param * dist_p2["x"]
        yy <- segment_p1["y"] + param * dist_p2["y"]
    }

    dx <- unname(start_point["x"] - xx)
    dy <- unname(start_point["y"] - yy)
    sqrt(dx * dx + dy * dy)
}

I used plotrix to add the circles to your plot:

plotrix::draw.circle(C["x"], C["y"], pDistance(C, A, B))
plotrix::draw.circle(C1["x"], C1["y"], pDistance(C1, A, B))

The actual function does not use any additional packages.

SamR
  • 8,826
  • 3
  • 11
  • 33