I have a CSV file containing a table with the columns Longitude, Latitude, and Wind Speed. I have code that reads a CSV file and deletes rows whose values fall outside a specified bound. I would like to keep only the rows whose longitude/latitude lies within a radius of 0.5 degrees of a point located at longitude -71.5 and latitude 40.5.

My example code below deletes every row whose longitude is not between -72 and -71 or whose latitude is not between 40 and 41. This keeps the values inside a square of ±0.5 degrees around my point of interest. What I actually want is the values inside a circle of radius 0.5 degrees around that point. How should I modify my code?

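import pandas as pd

# Read the table (columns: Longitude, Latitude, Wind Speed)
df = pd.read_csv(r"C:\Users\xil15102\Documents\results\EasternLongIsland50.csv")  # file path
# Index labels of rows outside the square -72 < lon < -71, 40 < lat < 41
indexNames = df[(df['Longitude'] <= -72) | (df['Longitude'] >= -71) | (df['Latitude'] <= 40) | (df['Latitude'] >= 41)].index
df.drop(indexNames, inplace=True)
# Overwrite the original file with the filtered table
df.to_csv(r"C:\Users\xil15102\Documents\results\EasternLongIsland50.csv")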
Bob

2 Answers

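Basically, you need to check whether each point is within a certain distance of the central point (-71.5, 40.5). To do this, use the Pythagorean theorem (the distance formula):

d = sqrt(dx^2 + dy^2),

where dx and dy are the differences in longitude and latitude between the point and the center.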

So programmatically, I would do this like:

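from math import sqrt

center_lon, center_lat = -71.5, 40.5
drop_indices = []

# Collect the index labels of rows farther than 0.5 degrees from the center.
for idx, row in df.iterrows():
    dx = row['Longitude'] - center_lon
    dy = row['Latitude'] - center_lat
    if sqrt(dx * dx + dy * dy) > 0.5:
        drop_indices.append(idx)

# drop() returns a new DataFrame, so assign the result back.
df = df.drop(drop_indices)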

Sorry, that is a sort of disgusting way to get rid of the rows, and your way looks much better, but the code should work.

Liam Keeley
  • Does that handle the case where the POI is near 0°? I know it's not the values from the question, but a sound solution should probably also work across the meridian. – BWStearns Dec 15 '19 at 21:48
  • That is a very good point. To handle this case, I guess you would first have to test whether the point is close to the meridian, then subtract the given longitude from 360 and add the result to the near-0 longitude to get the longitudinal distance. – Liam Keeley Dec 15 '19 at 21:56
  • This also does not work for larger radii, because it uses two-dimensional distance on a curved surface. For a more exact value you should take the sine of the latitude measurements and multiply by the Earth's radius, and the cosine for the latitude values, I think. – Liam Keeley Dec 15 '19 at 22:05
  • I think a complete solution is checking whether the cone from the center includes the points under test. Super irritating and probably not required for this user's use case, but still probably the ultimately correct course for a generalized solution. – BWStearns Dec 16 '19 at 04:22
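
For the wrap-around and curvature concerns raised in these comments, one common option (not something proposed in this thread) is the haversine great-circle distance. Below is a minimal sketch under stated assumptions: the names haversine_km, EARTH_RADIUS_KM, and radius_km are made up for illustration, the Earth is treated as a sphere with a mean radius of 6371 km, and the radius has to be given in kilometres rather than degrees.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # assumed mean Earth radius (spherical approximation)

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    # sin^2(dlon / 2) is periodic, so longitudes that wrap around still work.
    a = np.sin(dlat / 2) ** 2 + np.cos(lat1) * np.cos(lat2) * np.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))

# Hypothetical sample points, not data from the question.
lats = np.array([40.5, 41.5, 0.0])
lons = np.array([-71.5, -71.5, 179.5])
radius_km = 55.0  # hypothetical threshold, roughly 0.5 degrees of latitude
inside = haversine_km(lats, lons, 40.5, -71.5) <= radius_km
```

With a DataFrame, the same mask could be built from df['Latitude'] and df['Longitude'] and used as df[inside] to keep only the rows within the radius.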

You should write a function that calculates each row's distance from your point of interest, and drop the rows that are too far away. Some help here. Pretty sure the example below should work if you implement is_not_in_area as a function that calculates the distance and checks whether dist > 0.5.

df = df.drop(df[is_not_in_area(df['Latitude'], df['Longitude'])].index)

(This code lifted from here)

Edit: drop the ones that aren't in area, not the ones that are haha.
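
Here is one way is_not_in_area could look. It is only a sketch: the default center and radius come from the question, the parameter names are my own, the sample rows are made up, and it uses plain flat-plane distance in degrees, which should be fine for a radius this small.

```python
import numpy as np
import pandas as pd

def is_not_in_area(lat, lon, center_lat=40.5, center_lon=-71.5, radius=0.5):
    """Return a boolean mask that is True for points outside the circle."""
    dist = np.hypot(lon - center_lon, lat - center_lat)
    return dist > radius

# Hypothetical sample rows standing in for the CSV contents.
df = pd.DataFrame({
    "Longitude": [-71.5, -71.9, -71.1, -71.2],
    "Latitude": [40.5, 40.9, 40.5, 40.2],
    "Wind Speed": [5.0, 7.5, 6.1, 4.3],
})

# Drop every row whose point lies outside the circle.
df = df.drop(df[is_not_in_area(df["Latitude"], df["Longitude"])].index)
```

The second sample row (-71.9, 40.9) lies inside the ±0.5 square but about 0.57 degrees from the center, so it is the one that gets dropped.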

BWStearns