-1

enter image description here

I have shape files including all the coordinates (lat, long) of different polygons, for example the black, red, and green shown here. I would like to calculate the distance that the AB line across different polygons. I am using geopandas and also exploring gdal, but have not figured out any functions would allow this computation or I need implement from scratch. Thanks!

Matthew Borish
  • 3,016
  • 2
  • 13
  • 25
athlonshi
  • 1,711
  • 1
  • 19
  • 23

2 Answers2

2
import shapely
import geopandas as gpd
import geopy.distance

world = gpd.read_file(gpd.datasets.get_path("naturalearth_lowres"))

dims = (4, 3)
a, b, c, d = world.loc[world["name"].eq("Ukraine")].total_bounds
g = 0.4
# generate some polygons and a line that cuts through some of the polygons
gdf_grid = gpd.GeoDataFrame(
    geometry=[
        shapely.geometry.box(minx + g, miny + g, maxx - g, maxy - g)
        for minx, maxx in zip(
            np.linspace(a, c, dims[0]), np.linspace(a, c, dims[0])[1:]
        )
        for miny, maxy in zip(
            np.linspace(b, d, dims[1]), np.linspace(b, d, dims[1])[1:]
        )
    ],
    crs="epsg:4326",
)

gdf_line = gpd.GeoDataFrame(
    geometry=[
        shapely.geometry.LineString(
            [shapely.geometry.Point(a, b), shapely.geometry.Point(c, d)]
        )
    ],
    crs="epsg:4326",
)

# start point of linestring, point we sill calc distance from
l_start = gdf_line.loc[0, "geometry"].coords[0]
# associate every polygon with the line being measures
df_x = gdf_grid.merge(gdf_line.rename(columns={"geometry": "line"}), how="cross")
# intersection will give a line string of the points where line intersects polygon
# from each of these points calc distance to start point
gdf_grid["distances"] = df_x.intersection(gpd.GeoSeries(df_x["line"])).apply(
    lambda g: [round(geopy.distance.geodesic(l_start, p).km, 2) for p in g.coords]
)

# visualize ....
m = gdf_grid.explore(height=200, width=400)
gdf_line.explore(m=m, color="red")

data frames

The line does not pass through some of the polygons, hence no distances.

distances geometry
0 [120.44, 658.42] POLYGON ((27.68400190604637 45.69330801043112, 27.68400190604637 48.41419129088147, 22.48560835133485 48.41419129088147, 22.48560835133485 45.69330801043112, 27.68400190604637 45.69330801043112))
1 [] POLYGON ((27.68400190604637 49.21419129088147, 27.68400190604637 51.9350745713318, 22.48560835133485 51.9350745713318, 22.48560835133485 49.21419129088147, 27.68400190604637 49.21419129088147))
2 [752.25, 937.0] POLYGON ((33.68239546075789 45.69330801043112, 33.68239546075789 48.41419129088147, 28.48400190604637 48.41419129088147, 28.48400190604637 45.69330801043112, 33.68239546075789 45.69330801043112))
3 [1176.08, 1360.15] POLYGON ((33.68239546075789 49.21419129088147, 33.68239546075789 51.9350745713318, 28.48400190604637 51.9350745713318, 28.48400190604637 49.21419129088147, 33.68239546075789 49.21419129088147))
4 [] POLYGON ((39.68078901546941 45.69330801043112, 39.68078901546941 48.41419129088147, 34.48239546075789 48.41419129088147, 34.48239546075789 45.69330801043112, 39.68078901546941 45.69330801043112))
5 [1453.4, 1985.1] POLYGON ((39.68078901546941 49.21419129088147, 39.68078901546941 51.9350745713318, 34.48239546075789 51.9350745713318, 34.48239546075789 49.21419129088147, 39.68078901546941 49.21419129088147))
geometry
0 LINESTRING (22.08560835133486 45.29330801043113, 40.08078901546941 52.3350745713318)

visualisation

Two distances which relate to the two points line intersects polygons that is being hovered over

enter image description here

multiple lines, irregular polygons

  • scope of question has been extended: What if the polygon shapes are irregular and share boundaries? How do I check which intersect line belong to which polygon?
  • countries boundaries are irregular and shared. Used countries as polygons instead
  • have created 3 lines (horizontal, vertical and diagonal). Use exactly same approach, looping through lines and adding distances as columns to polygons geo data frame
gdf_poly = world.loc[(world["continent"]=="Europe") & (~world["iso_a3"].isin(["-99","RUS"]))].copy().reset_index(drop=True)
a,b,c,d = gdf_poly.total_bounds

gdf_line = gpd.GeoDataFrame(
    data={"line": ["diagonal", "vertical", "horizontal"]},
    geometry=[
        shapely.geometry.LineString([(a, b), (c, d)]),
        shapely.geometry.LineString([((a + c) / 2, b), ((a + c) / 2, d)]),
        shapely.geometry.LineString([(a, (b + d) / 2), (c, (b + d) / 2)]),
    ],
    crs="epsg:4326",
)

# multiple lines,  put distances into columns
for _, r in gdf_line.iterrows():
    l_start = r["geometry"].coords[0]

    gdf_poly[r["line"]] = gdf_poly.intersection(
        gpd.GeoSeries([r["geometry"]] * len(gdf_poly), crs=gdf_line.crs)
    ).apply(
        lambda g: [
            round(geopy.distance.geodesic(l_start, p).km, 2)
            for gg in ([g] if isinstance(g, shapely.geometry.LineString) else g.geoms)
            for p in gg.coords
        ]
    )

m = gdf_poly.explore(height=400, width=800)
gdf_line.explore(m=m, color="red")


  • Germany intersects vertical four times, so their are four distances for vertical

enter image description here

Rob Raymond
  • 29,118
  • 3
  • 14
  • 30
  • Thanks a lot! What if the polygon shapes are irregular and share boundaries. How do I check which intersect line belong to which polygon? – athlonshi Mar 08 '22 at 22:21
  • It does I’ll add another example to answer. Concept is simple - intersect line with polygons and get points of each intersection – Rob Raymond Mar 09 '22 at 08:08
  • 1
    updated - with more lines and more complex polygons. it works, just needed to extend to iterate through points in a multiline string intersection – Rob Raymond Mar 09 '22 at 14:43
  • Very much appreciated. While you were updating the example, I was also experimenting it and found out when it returns multiline object instead of linestring object, and figured out it was easy to deal with the irregular geometries. geopandas is a great tool. Thanks again for the kind help! – athlonshi Mar 09 '22 at 16:05
  • great - pls consider accepting my answer – Rob Raymond Mar 09 '22 at 17:27
  • Of course upvoted and accepted the answer – athlonshi Mar 10 '22 at 13:53
1

How about a clip and a spatial join? Here's a toy example. The index_right column in gdf_clip_sj contains the indices of the corresponding polygons from poly_gdf.

import geopandas as gpd
import pandas as pd
from shapely.geometry import Point, Polygon, LineString

ls_geom = [LineString([Point(0, 0), Point(5, 5)]),
           LineString([Point(5, 5), Point(20, 20)])]
ls_gdf = gpd.GeoDataFrame(geometry=ls_geom)



poly_geom = [Polygon([Point(1, 2), Point(1, 4), Point(3, 2)]),
             Polygon([Point(5, 10), Point(5, 20), Point(15, 10)])]

poly_gdf = gpd.GeoDataFrame(geometry=poly_geom)

gdf_clip = ls_gdf.clip(poly_gdf)

gdf_clip_sj = gpd.sjoin(gdf_clip, poly_gdf)

gdf_clip_sj['length'] = gdf_clip_sj['geometry'].length

Output of gdf_clip_sj:

    geometry                                        index_right length
0   LINESTRING (2.00000 2.00000, 2.50000 2.50000)      0        0.707107
1   LINESTRING (10.00000 10.00000, 12.50000 12.50000)  1        3.535534

Viz:

gdf_cat = pd.concat([poly_gdf, ls_gdf]).plot(facecolor='none')

enter image description here

gdf_clip_cat = pd.concat([gdf_clip_sj, poly_gdf])
gdf_clip_cat.plot(facecolor='none')

enter image description here

Edit: Worth noting that my original answer was using Cartesian math, but since you referenced a lat/long crs you may want to use Haversine distance as described in Rob's excellent answer. If you don't want to install any additional packages, you can use the formula given by derricw in this interesting post.

import numpy as np

def haversine_np(lon1, lat1, lon2, lat2):
    """
    Calculate the great circle distance between two points
    on the earth (specified in decimal degrees)

    All args must be of equal length.    

    """
    lon1, lat1, lon2, lat2 = map(np.radians, [lon1, lat1, lon2, lat2])

    dlon = lon2 - lon1
    dlat = lat2 - lat1

    a = np.sin(dlat/2.0)**2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon/2.0)**2

    c = 2 * np.arcsin(np.sqrt(a))
    km = 6367 * c
    return km

We can implement it by extracting the LineString coordinates to the coords column, and then using apply with the havesine_np function. FWIW, the indexing in this example will only work if each line has only 2 points.

gdf_clip_sj['coords'] = gdf_clip_sj.apply(lambda x: [y for y in x['geometry'].coords], axis=1)
gdf_clip_sj['hd'] = gdf_clip_sj['coords'].apply(lambda x: haversine_np(x[0][0], x[0][1], x[1][0], x[1][1]))

Output of gdf_clip_sj:

    geometry                                 index_right    length     coords                         hd
0   LINESTRING (2.00000 2.00000, 2.50000 2.50000)       0   0.707107    [(2.0, 2.0), (2.5, 2.5)]        78.546912
1   LINESTRING (10.00000 10.00000, 12.50000 12.50000)   1   3.535534    [(10.0, 10.0), (12.5, 12.5)]    389.112802

Quick spot check in QGIS:

enter image description here

Matthew Borish
  • 3,016
  • 2
  • 13
  • 25