
I have a dataframe with columns x, y, and data, and a list of polygon coordinates like this: polygon=[x1,y1,x2,y2,x3,y3,x4,y4]

I'd like to drop every row of the dataframe whose (x, y) lies outside the polygon, keeping only the rows inside it.


df.columns=['x','y','data']
polygon=[x1,y1,x2,y2,x3,y3,x4,y4]

df_1= df inside polygon

How can I implement the last line? Thanks.

martineau
roudan
  • Have you seen this: https://stackoverflow.com/questions/36399381/whats-the-fastest-way-of-checking-if-a-point-is-inside-a-polygon-in-python – tozCSS Jan 05 '22 at 00:29

1 Answer

See Shapely:

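from shapely.geometry import Point
from shapely.geometry.polygon import Polygon

# Build one Point per row; axis=1 makes apply() pass each row to the lambda
df['point'] = df.apply(lambda row: Point(row['x'],row['y']),axis=1)
# Polygon expects a sequence of (x, y) pairs, not a flat list
polygon = Polygon([(x1,y1), (x2,y2), (x3,y3), (x4,y4)])
# Keep only the rows whose point lies inside the polygon
df_1 = df[df['point'].apply(polygon.contains)].copy()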
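
For anyone who would rather avoid the Shapely dependency, here is a minimal pure-Python sketch of the same filter using even-odd ray casting. It is not part of the answer above: the helper names `pairs_from_flat` and `point_in_polygon` and the sample coordinates are made up for illustration, and points lying exactly on the boundary may be classified either way.

```python
def pairs_from_flat(flat):
    """Turn [x1, y1, x2, y2, ...] into [(x1, y1), (x2, y2), ...]."""
    if len(flat) % 2:
        raise ValueError("expected an even number of coordinates")
    return list(zip(flat[0::2], flat[1::2]))


def point_in_polygon(x, y, vertices):
    """Even-odd ray casting: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(vertices)
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed
        if (y0 > y) != (y1 > y):
            # x-coordinate where the edge meets that horizontal line
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


# Hypothetical sample data: a 4x4 square given as a flat list, and three points
flat_polygon = [0, 0, 4, 0, 4, 4, 0, 4]
vertices = pairs_from_flat(flat_polygon)
xs = [1.0, 5.0, 2.0]
ys = [1.0, 1.0, 3.5]

# One boolean per point, True when the point is inside the polygon
mask = [point_in_polygon(x, y, vertices) for x, y in zip(xs, ys)]
print(mask)
```

With a dataframe, the same mask can be built from zip(df['x'], df['y']) and used for boolean indexing, e.g. df_1 = df[mask].copy(). This loops in Python, so it is likely to be slow on large frames.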
tozCSS
  • Yes, I saw it, but I was hoping not to use Shapely. OK, I've installed Shapely now, but I get another error on the first line: TypeError: cannot convert the series to <class 'float'>. Any suggestions? My dataframe has 3 columns: x, y and data. – roudan Jan 05 '22 at 00:47
  • Can you do: df['x'] = df['x'].astype(float) and df['y'] = df['y'].astype(float)? – tozCSS Jan 05 '22 at 00:51
  • Same error. I think the problem is in lambda row: Point(row["x"], row["y"]): row['x'] is a Series. – roudan Jan 05 '22 at 00:54
  • If you get this error, it means that some of your x and/or y values are not floating point numbers. – tozCSS Jan 05 '22 at 00:55
  • I checked, and they are all float type. Here is the output of df.info(): Int64Index: 9668 entries, 0 to 12849 Data columns (total 5 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 X 9668 non-null float64 1 Y 9668 non-null float64 – roudan Jan 05 '22 at 00:58
  • If I just do this, it works fine: df = df.assign(point = lambda row: row["X"]+row["Y"]) – roudan Jan 05 '22 at 01:05
  • Sorry, my bad. I fixed the code. – tozCSS Jan 05 '22 at 01:17
  • Thanks tozCSS, it works now. I appreciate your help. I am wondering why this one doesn't work: df = df.assign(point = lambda row: Point(row["X"],row["Y"]),axis=1). I added axis=1 inside assign() and got the same error. Why? – roudan Jan 05 '22 at 01:47
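
On the last question, a likely explanation is that DataFrame.assign() calls a callable once with the whole dataframe rather than once per row. That makes row["X"] a whole Series, and anything that needs a scalar float fails. assign() also treats every keyword as a new column, so axis=1 would simply request a column named "axis". The sketch below illustrates this without Shapely: `make_pair` is a hypothetical stand-in for Point, and the sample values are made up.

```python
import pandas as pd

df = pd.DataFrame({"X": [1.0, 5.0], "Y": [1.0, 1.0]})


def make_pair(x, y):
    # Hypothetical stand-in for shapely's Point: it needs scalar coordinates
    return (float(x), float(y))


# assign() passes the whole DataFrame to the lambda, so d["X"] is a Series
seen = df.assign(kind=lambda d: type(d["X"]).__name__)
print(seen["kind"].iloc[0])

# float() cannot convert a multi-element Series, hence the TypeError
caught = None
try:
    df.assign(point=lambda d: make_pair(d["X"], d["Y"]))
except TypeError as exc:
    caught = exc
print("TypeError:", caught)

# Keywords given to assign() become column names, so axis=1 adds a column
extra = df.assign(axis=1)
print(list(extra.columns))

# apply(..., axis=1) is what hands the lambda one row at a time
df["point"] = df.apply(lambda row: make_pair(row["X"], row["Y"]), axis=1)
print(df["point"].tolist())
```

This is also why row["X"]+row["Y"] works inside assign(): adding two Series is a valid vectorized operation, while building a single point from two Series is not.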