
THE DATA: I have two lists of co-ordinates, a and b, of length 100,000 each, and a text file containing the co-ordinates of several polygons.

I would like to now remove all those points of a and b that are inside the different polygons.

To do so, I am trying to use THE CODE FROM THIS ANSWER IN STACKOVERFLOW which does it for one point and one polygon.

The method I have chalked out to go about the problem for several points and several polygons is this:

  • Take the co-ordinates of the first polygon
  • Run the function for all the 100,000 points of a and b and if they are inside, then append them to a list, which I can use later to compare with the original a and b
  • Perform the above two steps with the co-ordinates of the second polygon and so on...

Now I have two problems facing me, which I don't know how to proceed with.

  1. The text file containing the co-ordinates of the polygons looks like this:

    020241-041200 4 30.83 -3.69 30.82 -3.69 30.82 -3.73 30.83 -3.73

    020241-041200 12 30.91 -4.03 30.89 -4.03 30.85 -4.05 30.83 -4.07 30.82 -4.09 30.84 -4.16 30.89 -4.19 30.96 -4.16 30.97 -4.13 30.97 -4.08 30.95 -4.05 30.93 -4.04

    Here (020241-041200) is the ID of the polygon, and (4) is the number of corners the polygon has, 30.83 is the X co-ordinate of the first corner and -3.69 is the Y co-ordinate of the first corner and so on.

    I want to skip the first two columns so that I can only consider the X,Y co-ordinates of the polygons. How do I do that?

  2. The polygons do not all have the same shape: as you can see, the second polygon has 12 corners compared to 4 in the first one.

[Figure: the 100,000 points of a and b]

If there is any convenient way, other than the solution I have given above, it would also be useful.

All I want are those points of a and b that are outside the polygons.

Srivatsan

1 Answer


When you say "those points of a and b that are outside the polygons" do you mean outside all of the polygons or outside any of the polygons?

Here are:

  1. A routine to read in the polygon points and create the appropriate data structure for use with the point_in_poly function.
  2. A routine to check if a point is in any of the polygons.

Here is the routine to read the polygon points from a file:

def readPolygons(path):
  polygons = []
  with open(path) as f:
    for line in f:                                         # (1)
      words = [ float(y) for y in (line.split())[2:] ]     # (2)
      poly = list(zip(words[::2], words[1::2]))            # (3)
      if len(poly):                                        # (4)
        polygons.append(poly)
  return polygons

Each polygon is represented as a list of pairs of floats, and the routine returns a list of polygons.

Notes:

  1. Iterate over all of the lines in the file.
  2. Split the line into words, drop the first two words with [2:] and convert each word to a float.
  3. Create a list of pairs, taking the 1st, 3rd, 5th, etc as the x coordinates and the 2nd, 4th, 6th, etc as the y coordinates.
  4. Ignore blank lines.
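
To make the notes above concrete, here is a minimal sketch that applies the same parsing to the first sample line from the question. The helper name parse_polygon_line is hypothetical, and the check against the corner count in the second column is an added assumption, not part of the routine above:

```python
def parse_polygon_line(line):
    """Parse 'ID N x1 y1 x2 y2 ...' into a list of (x, y) tuples."""
    fields = line.split()
    if len(fields) < 2:
        return []                       # blank or incomplete line
    coords = [float(v) for v in fields[2:]]
    pairs = list(zip(coords[::2], coords[1::2]))
    # Assumed sanity check: the second column is the number of corners.
    if int(fields[1]) != len(pairs):
        raise ValueError("corner count does not match the coordinates")
    return pairs

sample = "020241-041200 4 30.83 -3.69 30.82 -3.69 30.82 -3.73 30.83 -3.73"
square = parse_polygon_line(sample)
# square == [(30.83, -3.69), (30.82, -3.69), (30.82, -3.73), (30.83, -3.73)]
```

Because each line is handled on its own, polygons with different numbers of corners need no special treatment.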

Here is the routine to check if a point is in any polygon:

def inAnyPolygon(x,y,polygons):
  for p in polygons:
    if point_in_poly(x,y,p):
      return True
  return False

If your criterion is "in all the polygons", then use:

def inAllPolygons(x,y,polygons):
  for p in polygons:
    if not point_in_poly(x,y,p):
      return False
  return True
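
Both routines rely on point_in_poly, which comes from the answer linked in the question and is not reproduced here. For reference, a minimal ray-casting sketch with the same call signature might look like the following. It is an assumption about how such a function can be written, and may differ in detail from the linked version (for example, in how points exactly on an edge are classified):

```python
def point_in_poly(x, y, poly):
    """Even-odd ray-casting test: True if (x, y) lies inside poly.

    poly is a list of (x, y) vertex pairs, as returned by readPolygons.
    Points lying exactly on an edge may be classified either way.
    """
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]      # wrap around to close the polygon
        # Does this edge straddle the horizontal line through y?
        if (y1 > y) != (y2 > y):
            # x co-ordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside     # each crossing to the right flips the state
    return inside

square = [(30.83, -3.69), (30.82, -3.69), (30.82, -3.73), (30.83, -3.73)]
inside_square = point_in_poly(30.825, -3.71, square)   # centre of the square
outside_square = point_in_poly(31.0, -3.71, square)    # to the right of it
```

The test counts how many polygon edges a horizontal ray from the point crosses: an odd count means the point is inside.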

Update: if you have a list of points called points, you can create another list containing those points which are not in any of the polygons with:

outliers = []
for p in points:
  (x,y) = p
  if not inAnyPolygon(x,y,polygons):
    outliers.append(p)

If a and b are lists of numbers representing the x and y coordinates respectively of the 100000 points, here is the code to find the outliers:

outliers = []
for (x,y) in zip(a,b):
  if not inAnyPolygon(x,y,polygons):
    outliers.append((x,y))
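
If separate x and y lists are needed afterwards rather than a list of pairs, the pairs can be split again. A minimal sketch follows; the helper name split_pairs and the names a_outside and b_outside are illustrative assumptions, and the sample values are made up for the demonstration:

```python
def split_pairs(pairs):
    """Turn [(x0, y0), (x1, y1), ...] back into two lists (xs, ys)."""
    if not pairs:
        return [], []                   # zip(*[]) would yield nothing to unpack
    xs, ys = zip(*pairs)
    return list(xs), list(ys)

# Made-up pairs standing in for the real outliers list.
outliers = [(36.35, -4.0), (34.43, -10.2), (32.0, -7.5)]
a_outside, b_outside = split_pairs(outliers)
# a_outside == [36.35, 34.43, 32.0]
# b_outside == [-4.0, -10.2, -7.5]
```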
ErikR
  • "those points of a and b that are outside the polygons": I want all those points outside all of my polygons. I shall immediately perform your code and check !!! – Srivatsan Nov 19 '14 at 11:15
  • I have `a and b` which are two variables with 100,000 points each. Your `readPolygons` now gives me a tuple with the co-ordinates of the polygons!!! Now I need to get `a_outside` and `b_outside`, i.e. the points outside the polygons. BTW by `points`, do you mean my `a and b`? – Srivatsan Nov 19 '14 at 11:55
  • Give me an example of what `a` and `b` look like. – ErikR Nov 19 '14 at 12:00
  • They are there in the figure in the question! `a` is a list of 100,000 points ranging from 30.0-38.0 and `b` is a list of 100,000 points ranging from -11.0 to -3.7. The co-ordinates of the several polygons also lie within this range. – Srivatsan Nov 19 '14 at 12:02
  • `readPolygons` returns a list of "polygons" where each "polygon" is a list of pairs of numbers, and it does that because in the `point_in_polygon` function a polygon is represented by a list of pairs of numbers. – ErikR Nov 19 '14 at 12:04
  • Please give me a concrete example for `a` and `b`. Are they lists? Are they lists of numbers? Are they lists of pairs of numbers? It is not clear what they are. – ErikR Nov 19 '14 at 12:05
  • Let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/65205/discussion-between-thepredator-and-user5402). – Srivatsan Nov 19 '14 at 12:05
  • example, `a=array([ 36.34511603, 34.42881285, 32.00052102, 35.20495323, 35.80194333, 36.79451664, 32.08352502, 35.49977049, 33.60233316])` – Srivatsan Nov 19 '14 at 12:09