29

How can I check if a geopoint is within the area of a given shapefile?

I managed to load a shapefile in python, but can't get any further.

zadrozny
  • 1,631
  • 3
  • 22
  • 27
Gerald Bäck
  • 445
  • 1
  • 5
  • 7

7 Answers7

33

Another option is to use Shapely (a Python library based on GEOS, the engine for PostGIS) and Fiona (which is basically for reading/writing files):

import fiona
import shapely

with fiona.open("path/to/shapefile.shp") as fiona_collection:

    # In this case, we'll assume the shapefile only has one record/layer (e.g., the shapefile
    # is just for the borders of a single country, etc.).
    shapefile_record = fiona_collection.next()

    # Use Shapely to create the polygon
    shape = shapely.geometry.asShape( shapefile_record['geometry'] )

    point = shapely.geometry.Point(32.398516, -39.754028) # longitude, latitude

    # Alternative: if point.within(shape)
    if shape.contains(point):
        print "Found shape for point."

Note that doing point-in-polygon tests can be expensive if the polygon is large/complicated (e.g., shapefiles for some countries with extremely irregular coastlines). In some cases it can help to use bounding boxes to quickly rule things out before doing the more intensive test:

minx, miny, maxx, maxy = shape.bounds
bounding_box = shapely.geometry.box(minx, miny, maxx, maxy)

if bounding_box.contains(point):
    ...

Lastly, keep in mind that it takes some time to load and parse large/irregular shapefiles (unfortunately, those types of polygons are often expensive to keep in memory, too).

clint
  • 14,402
  • 12
  • 70
  • 79
  • I have many points and many polygons (2000 each). What is the fast method to find each polygon for each point? I have problem in developing the code. – mah65 Nov 08 '20 at 05:50
  • 1
    Excellent answer, using `shapely` / `fiona` is just so much easier than using `ogr`. Just one more thing i found useful: if you're using multiple layers/features in a single shapefile you can just iterate over the elements in the collection, use the `asShape` method and return the correct feature if it contains your point. This can also be used to preserve any attributes (like names and codes) that might be present on the feature. – Husky Jan 03 '21 at 23:40
21

This is an adaptation of yosukesabai's answer.

I wanted to ensure that the point I was searching for was in the same projection system as the shapefile, so I've added code for that.

I couldn't understand why he was doing a contains test on ply = feat_in.GetGeometryRef() (in my testing things seemed to work just as well without it), so I removed that.

I've also improved the commenting to better explain what's going on (as I understand it).

#!/usr/bin/python
import ogr
from IPython import embed
import sys

drv = ogr.GetDriverByName('ESRI Shapefile') #We will load a shape file
ds_in = drv.Open("MN.shp")    #Get the contents of the shape file
lyr_in = ds_in.GetLayer(0)    #Get the shape file's first layer

#Put the title of the field you are interested in here
idx_reg = lyr_in.GetLayerDefn().GetFieldIndex("P_Loc_Nm")

#If the latitude/longitude we're going to use is not in the projection
#of the shapefile, then we will get erroneous results.
#The following assumes that the latitude longitude is in WGS84
#This is identified by the number "4326", as in "EPSG:4326"
#We will create a transformation between this and the shapefile's
#project, whatever it may be
geo_ref = lyr_in.GetSpatialRef()
point_ref=ogr.osr.SpatialReference()
point_ref.ImportFromEPSG(4326)
ctran=ogr.osr.CoordinateTransformation(point_ref,geo_ref)

def check(lon, lat):
    #Transform incoming longitude/latitude to the shapefile's projection
    [lon,lat,z]=ctran.TransformPoint(lon,lat)

    #Create a point
    pt = ogr.Geometry(ogr.wkbPoint)
    pt.SetPoint_2D(0, lon, lat)

    #Set up a spatial filter such that the only features we see when we
    #loop through "lyr_in" are those which overlap the point defined above
    lyr_in.SetSpatialFilter(pt)

    #Loop through the overlapped features and display the field of interest
    for feat_in in lyr_in:
        print lon, lat, feat_in.GetFieldAsString(idx_reg)

#Take command-line input and do all this
check(float(sys.argv[1]),float(sys.argv[2]))
#check(-95,47)

This site, this site, and this site were helpful regarding the projection check. EPSG:4326

thagorn
  • 727
  • 7
  • 14
Richard
  • 56,349
  • 34
  • 180
  • 251
  • 1
    Thank you for improved answer. i recall i just copied code from my work at the time and there was some dirtiness left, like left over of what i did for some other purpose. I tried to clean it for posting here but that introduced some error too as you picked up in your edit. – yosukesabai Nov 18 '12 at 15:32
  • 1
    No problem, @yosukesabai, thanks so much for pointing the way. It seems as though documentation and examples for these packages are lacking. Do you, perchance, have thoughts regarding [this question](http://stackoverflow.com/questions/13439357/extract-point-from-raster-in-gdal)? – Richard Nov 18 '12 at 17:37
  • gdal's dataset object seems to have GetProjection() method which seems to return well known text (WKT) for the projection. i guess you can convert that to projection object somehow in osr. – yosukesabai Nov 18 '12 at 22:10
  • And I agree, gdal documentation for python binding is not very good... What I usuall do is to make object and then use dir() on it. Identify methods or property which seems like what i want to do, and then look it up in C or other documentation. – yosukesabai Nov 19 '12 at 00:32
  • 1
    WGS84 is 4326 so this should say point_ref.ImportFromEPSG(4326), this code as written will lead to subtly incorrect shape file reading. – thagorn Sep 02 '15 at 16:22
  • 1
    Thanks for editing it, @thagorn. That was a silly mistake to make! 4236 is the "Hu Tzu Shan 1950" projection. 4326 is definitely correct. – Richard Sep 02 '15 at 19:50
  • Let's say I have GPS data (list of lat,lng values) and I want to query that GPS data to check whether the movement was on land or water. Is the data that detailed to catch ferry rides in US atleast ? Or I am better of comparing these coordinates with the water body data set. The max file size was 111 mb for water body data, so I am not sure if that is much detail or not. – nr5 Apr 15 '19 at 08:45
  • @nr5: I recommend loading some subsets of your data into QGIS and checking this manually. It depends a bit on the accuracy you need. For the US check the US Census shapefiles. – Richard Apr 15 '19 at 16:03
17

Here is a simple solution based on pyshp and shapely.

Let's assume that your shapefile only contains one polygon (but you can easily adapt for multiple polygons):

import shapefile
from shapely.geometry import shape, Point

# read your shapefile
r = shapefile.Reader("your_shapefile.shp")

# get the shapes
shapes = r.shapes()

# build a shapely polygon from your shape
polygon = shape(shapes[0])    

def check(lon, lat):
    # build a shapely point from your geopoint
    point = Point(lon, lat)

    # the contains function does exactly what you want
    return polygon.contains(point)
chilladx
  • 2,037
  • 2
  • 16
  • 19
  • [Fiona](https://pypi.python.org/pypi/Fiona) is also a good alternative to pyshp, and make it easy to parse the shapefile and get different polygons. – chilladx Mar 11 '17 at 12:48
  • Note, pyshp is much more portable (will work on more computers and servers) and lighter weight than Fiona. They're quite different behind the scenes. – Preston Badeer Sep 25 '22 at 06:08
2

i did almost exactly what you are doing yesterday using gdal's ogr with python binding. It looked like this.

import ogr

# load the shape file as a layer
drv    = ogr.GetDriverByName('ESRI Shapefile')
ds_in  = drv.Open("./shp_reg/satreg_etx12_wgs84.shp")
lyr_in = ds_in.GetLayer(0)

# field index for which i want the data extracted 
# ("satreg2" was what i was looking for)
idx_reg = lyr_in.GetLayerDefn().GetFieldIndex("satreg2")


def check(lon, lat):
  # create point geometry
  pt = ogr.Geometry(ogr.wkbPoint)
  pt.SetPoint_2D(0, lon, lat)
  lyr_in.SetSpatialFilter(pt)

  # go over all the polygons in the layer see if one include the point
  for feat_in in lyr_in:
    # roughly subsets features, instead of go over everything
    ply = feat_in.GetGeometryRef()

    # test
    if ply.Contains(pt):
      # TODO do what you need to do here
      print(lon, lat, feat_in.GetFieldAsString(idx_reg))
Richard
  • 56,349
  • 34
  • 180
  • 251
yosukesabai
  • 6,184
  • 4
  • 30
  • 42
1

One way to do this is to read the ESRI Shape file using the OGR library Link and then use the GEOS geometry library http://trac.osgeo.org/geos/ to do the point-in-polygon test. This requires some C/C++ programming.

There is also a python interface to GEOS at http://sgillies.net/blog/14/python-geos-module/ (which I have never used). Maybe that is what you want?

Another solution is to use the http://geotools.org/ library. That is in Java.

I also have my own Java software to do this (which you can download from http://www.mapyrus.org plus jts.jar from http://www.vividsolutions.com/products.asp ). You need only a text command file inside.mapyrus containing the following lines to check if a point lays inside the first polygon in the ESRI Shape file:

dataset "shapefile", "us_states.shp"
fetch
print contains(GEOMETRY, -120, 46)

And run with:

java -cp mapyrus.jar:jts-1.8.jar org.mapyrus.Mapyrus inside.mapyrus

It will print a 1 if the point is inside, 0 otherwise.

You might also get some good answers if you post this question on https://gis.stackexchange.com/

Glorfindel
  • 21,988
  • 13
  • 81
  • 109
Simon C
  • 1,977
  • 11
  • 14
0

If you want to find out which polygon (from a shapefile full of them) contains a given point (and you have a bunch of points as well), the fastest way is using postgis. I actually implemented a fiona based version, using the answers here, but it was painfully slow (I was using multiprocessing and checking bounding box first). 400 minutes of processing = 50k points. Using postgis, that took less than 10seconds. B tree indexes are efficient!

shp2pgsql -s 4326 shapes.shp > shapes.sql

That will generate a sql file with the information from the shapefiles, create a database with postgis support and run that sql. Create a gist index on the geom column. Then, to find the name of the polygon:

sql="SELECT name FROM shapes WHERE ST_Contains(geom,ST_SetSRID(ST_MakePoint(%s,%s),4326));"
cur.execute(sql,(x,y))
Fábio Dias
  • 636
  • 8
  • 18