0

I have some land parcel polygons and I have some powerline linestrings. For each parcel, I'm trying to find only the distance to the closest linestring.

This query runs but takes quite a long time:

SELECT 
  a.OBJECTID,
  st_distance(a.geometry, b.geometry) as distance

FROM `kpennell.KYLETEST.parcelpoly` a
join `kpennell.KYLETEST.powerlines` b
ON st_dwithin(a.geometry, b.geometry, 1000) 

I'm guessing I need to use Paul Ramsay Or Michael Entin's nearest neighbors things, but I can't seem to get the query right. The latter example finds just the id of the closest one but not the actual distance (I think).

SELECT 
  a.id, 
  ARRAY_AGG(b.id ORDER BY ST_Distance(a.geog, b.geog) LIMIT 1)
      [ORDINAL(1)] as neighbor_id
FROM people_table a JOIN restaurant_table b
ON ST_DWithin(a.geog, b.geom, 100) -- 100 is search radius
GROUP BY a.id

The Parcels are Polygons with a unique object id parcels

The Powerlines are linestrings with a unique object id enter image description here

What I'm hoping to find, for each parcel, is the parcel, and then the distance to the closest powerline. So for each parcel, I want to get the parcel OBJECTID and then distance to the closest powerline.

I'm trying to do this efficiently where it stops when it finds the nearest neighbor for each one.

Kyle Pennell
  • 5,747
  • 4
  • 52
  • 75

1 Answers1

2

We can combine the distance with the id using a STRUCT:

SELECT 
  poly.id, 
  ARRAY_AGG(STRUCT(pline.id, ST_Distance(poly.geog, pline.geog)) 
    ORDER BY ST_Distance(poly.geog, pline.geog) LIMIT 1
  ) [ORDINAL(1)] as neighbor
FROM 
    `kpennell.KYLETEST.parcelpoly` poly JOIN 
    `kpennell.KYLETEST.powerlines` pline ON 
     ST_DWithin(poly.geog, pline.geom, 100) -- 100 is search radius
GROUP BY poly.id

There are other options though:

If we know the parcelpoly id before the query (from an input form etc.) we can have a simplified query:

SELECT 
  poly.id,
  pline.id,
  st_distance(poly.geometry, pline.geometry) as distance
FROM `kpennell.KYLETEST.parcelpoly` AS poly,
     `kpennell.KYLETEST.powerlines` AS pline
WHERE poly.id = <known_parcelpoly_id> AND st_dwithin(poly.geometry, pline.geometry, 1000)
ORDER BY st_distance(poly.geometry, pline.geometry) 
LIMIT 1

Finally, we can use PostGIS's knn operator (<->):

SELECT 
  poly.id,
  pline.id,
  st_distance(poly.geometry, pline.geometry) as distance
FROM `kpennell.KYLETEST.parcelpoly` AS poly,
     `kpennell.KYLETEST.powerlines` AS pline
WHERE poly.id = <known_parcelpoly_id> AND st_dwithin(poly.geometry, pline.geometry, 1000)
ORDER BY poly.geometry <-> pline.geometry
LIMIT 1

There is a caveat though here!
The <-> operator:

For PostgreSQL below 9.5 only gives centroid distance of bounding boxes and for PostgreSQL 9.5+, does true KNN distance search giving true distance between geometries, and distance sphere for geographies.

which means that for PostgreSQL < 9.5 we may lose accuracy with that method.

John Moutafis
  • 22,254
  • 11
  • 68
  • 112
  • I do appreciate the answer, still not quite getting it to work in BigQuery – Kyle Pennell Nov 10 '21 at 21:19
  • @KylePennell is there an error? Or a different behavior? – John Moutafis Nov 10 '21 at 21:23
  • Some might have a distance of 0 but there's a lot of parcels and some should be more than that – Kyle Pennell Nov 10 '21 at 21:29
  • @KylePennell What srid are you using? Does it use meters or degrees? – John Moutafis Nov 10 '21 at 21:33
  • Should be EPSG:4326 WGS 84 – Kyle Pennell Nov 10 '21 at 21:37
  • @KylePennell `ST_Dwithin` uses the units of the SRID for geometries, so your query is in degrees. Here is an explanation on why this is an issue: https://stackoverflow.com/questions/57137122/geodjango-finding-objects-in-radius/57270114#57270114. That is also true for `ST_Distance` so those `0.0`s there are probably truncated distances in degrees. – John Moutafis Nov 10 '21 at 21:44
  • 1
    I think st_distance returns the shortest distance in meters between two non-empty GEOGRAPHYs. ST_dwithin also work with meters 'The given distance is in meters on the surface of the Earth.' https://cloud.google.com/bigquery/docs/reference/standard-sql/geography_functions#st_distance – Kyle Pennell Nov 10 '21 at 22:11
  • @KylePennell Yeah seems true :/ Have you checked that your data are GEOGRAPHY type and not GEOMETRY (just to make sure)? – John Moutafis Nov 10 '21 at 22:19
  • Yeah, they seem to be geography. Any other ideas? – Kyle Pennell Nov 16 '21 at 17:46