4

I'm working with two dataframes in R: a "red" dataframe and a "black" dataframe. In both there are two columns representing the coordinates.
I used a plot to explain what I want to do.

I would like to select all the points from the "red" dataframe that are beyond the "black" line. E.g. all the points excluded from the area of the polygon delimited by the black points.

enter image description here

Giordano
  • 81
  • 5

5 Answers5

1

You could use the sf package to define a convex hull and intersect your target points with that polygon.

Define a convex hull based on black:

library(sf)

set.seed(99)
red <- data.frame(x = runif(100,-10,10), y = runif(100,-4,4))
black <- data.frame(x = runif(100,-8,8), y = runif(100,-4,3))

# Convert df to point feature
blk <- st_as_sf(black, coords = c("x", "y"))
# Convert to multipoint
blk_mp <- st_combine(blk)
# Define convex hull
blk_poly <- st_convex_hull(blk_mp)

plot(black)
points(red, col = "red")
plot(blk_poly, add = TRUE)

enter image description here

Intersecting red with the convex hull returns red within that polygon:

rd <- st_as_sf(red, coords = c("x", "y"))
rd_inside <- st_intersection(rd, blk_poly)

plot(black)
points(red)
plot(blk_poly, add = TRUE)
plot(rd_inside, pch = 24, col = "red", bg = "red", add = TRUE)

enter image description here

Skaqqs
  • 4,010
  • 1
  • 7
  • 21
  • 1
    The "black line" corresponds to a concave polygon. –  May 25 '22 at 12:30
  • I think using the convex hull suggested by @Skaqqs could be the right way, the problem is that I need to draw a line that traces the contour as precisely as possible. Indeed a convex hull doesn't work for me because I will include in the convex all the red points in the diagonal that would be formed in the upper left-hand corner of the plot, not being able to exclude them and retrieve them – Giordano May 25 '22 at 12:58
  • @Giordano, Can you share the data you used to make the plot in your question? If you points are in order, we could easily convert them to a polygon. – Skaqqs May 27 '22 at 15:37
1

One possible solution is to draw the polygon after the points and fill its outer area white. This cannot be done directly with polygon or polypath, because these functions can only fill the interiour of a polygon. You can however fill the area between two polygons with polypath. Thus you can add a second polygon that encompasses (or goes beyond) the borders of your plot.

Here is an example that works in base R:

p.outer <- list(x=c(0,100,100,0), y=c(0,0,100,100))
p.inner <- list(x=c(20,40,80,50,40,30), y=c(30,20,70,80,50,60))
plot(p.outer, type="n")
points(runif(100, min=0, max=100), runif(100, min=0, max=100))
polypath(x=c(p.outer$x, NA, p.inner$x), y = c(p.outer$y, NA, p.inner$y), col ="white", rule="evenodd")

enter image description here

cdalitz
  • 1,019
  • 1
  • 6
  • 7
1

My previous answer only showed how not to draw the points outside the polygon. To actually identify the points outside the polygon, you can use the function pip2d from the package ptinpoly. It returns negative values for points outside the polygon.

Example:

library(ptinpoly)
poly.vertices <- data.frame(x=c(20,40,80,50,40,30), y=c(30,20,70,80,50,60))
p <- data.frame(x=runif(100, min=0, max=100), y=runif(100, min=0, max=100))

outside <- (pip2d(as.matrix(poly.vertices), as.matrix(p)) < 0)

plot(p$x, p$y, col=ifelse(outside, "red", "black"))
polygon(poly.vertices$x, poly.vertices$y, border="blue", col=NA)

The same should be achieved with the function PtInPoly from the package DescTools, which returns zero for points outside the polygon. The implementation of ptinpoly, however, has the advantage of implmenting the particularly efficient algorithm described in

J. Liu, Y.Q. Chen, J.M. Maisog, G. Luta: "A new point containment test algorithm based on preprocessing and determining triangles." Computer-Aided Design, Volume 42, Issue 12, December 2010, Pages 1143-1150

Edit: Out of curiosity, I have compared the runtime of ptinpoly::pip2d and DescTools::PtInPoly with microbenchmark and N=50000 points, and pip2d turned out to be considerably faster:

> microbenchmark(outside.pip2d(), outside.PtInPoly())
Unit: milliseconds
              expr       min        lq      mean   median        uq      max
   outside.pip2d()  3.375084  3.421631  4.459051  3.48939  4.251395 65.97793
outside.PtInPoly() 27.537927 27.666688 28.739288 27.97984 28.514595 90.11313

neval 100 100

enter image description here

cdalitz
  • 1,019
  • 1
  • 6
  • 7
  • Very nice solution! I tried but it seems that is quite computationally challenging. I think I need a solution based on dataframe joining or filtering – Giordano May 30 '22 at 08:13
  • 1
    Have you tried both `DescTools::PtInPoly` and `ptinpoly::pip2d` and both are too slow? As a preprocessig step, you can remove all points that are obviously outside (all points outside the outer bounding box of the polygon) and obviously inside (all points inside the largest rectangle inside the polygon), but this might still leave over too many points. – cdalitz May 30 '22 at 16:09
1

Using the function PtInPoly of package DescTools, as suggested by @cdalitz, I resolved the problem. This function returned a data frame of the coordinates of the points (in my case the red coordinates) and a third column "pip" of 1 (if the point is within the polygon) and 0s (if outside the polygon).

I will use another dataset to show you the result:

 try <- DescTools::PtInPoly(pnts = red[,c("x","y")], poly.pnts = black[,c("x","y")])

 ggplot()+
      geom_point(try, mapping = aes(x = x, y = y, color = as.character(pip))) +
      geom_polygon(data = black, mapping = aes(x,y))

enter image description here

Giordano
  • 81
  • 5
  • As you commented above that `ptinpoly::pip2d` was too slow for your needs, I wonder why `DescTools::PtInPoly` was faster. In my tests, it was the other way around. – cdalitz May 31 '22 at 14:38
  • I don't know actually. Using ptinpoly my R crashed. I thought it was for computational effort – Giordano Jun 06 '22 at 07:42
0

Seems that you can reconstruct the edges of the black polygon by simply joining every point to its nearest neighbor and its nearest neighbor in the opposite direction. Then perform point-in-polygon tests.