I have a dataframe with longitude and latitude columns. I need to get the county name for each location from its longitude and latitude values, using the geopy package.
longitude latitude housing_median_age total_rooms total_bedrooms \
0 -114.31 34.19 15.0 5612.0 1283.0
1 -114.47 34.40 19.0 7650.0 1901.0
2 -114.56 33.69 17.0 720.0 174.0
3 -114.57 33.64 14.0 1501.0 337.0
4 -114.57 33.57 20.0 1454.0 326.0
population households median_income median_house_value
0 1015.0 472.0 1.4936 66900.0
1 1129.0 463.0 1.8200 80100.0
2 333.0 117.0 1.6509 85700.0
3 515.0 226.0 3.1917 73400.0
4 624.0 262.0 1.9250 65500.0
I had success with a for loop:
geolocator = geopy.Nominatim(user_agent='1234')
for index, row in df.iloc[:10, :].iterrows():
    location = geolocator.reverse([row["latitude"], row["longitude"]])
    county = location.raw['address']['county']
    print(county)
The dataset has 17,000 rows, so looping over every row like this is going to be slow, right?
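For a sense of scale, here is a rough sketch, assuming the lookups go to the public Nominatim server, whose usage policy allows at most one request per second. Under that assumption the network calls dominate the run time, so the main savings come from sending fewer requests, for example by looking up each distinct coordinate pair only once. The names `reverse_unique` and `stub_reverse` are made up for illustration, and the stub stands in for `geolocator.reverse` so the snippet runs offline.

```python
import time

def reverse_unique(coords, reverse, min_delay_seconds=1.0, sleep=time.sleep):
    """Call reverse once per distinct (lat, lon) pair, pausing between calls."""
    results = {}
    for pair in coords:
        if pair in results:
            continue  # already looked up, no new request needed
        if results:
            sleep(min_delay_seconds)  # throttle every request after the first
        results[pair] = reverse(pair)
    return results

# Offline demo: a stub replaces geolocator.reverse (hypothetical return values).
calls = []
def stub_reverse(pair):
    calls.append(pair)
    return "county for %s" % (pair,)

coords = [(34.19, -114.31), (34.40, -114.47), (34.19, -114.31)]
lookup = reverse_unique(coords, stub_reverse, sleep=lambda seconds: None)

# Rough time budget if all 17,000 rows had distinct coordinates,
# at one request per second.
hours = 17000 * 1.0 / 3600
```

In the demo the repeated pair triggers only two calls, and the time budget for 17,000 distinct lookups comes out at a little under five hours. geopy also ships a `RateLimiter` helper in `geopy.extra.rate_limiter` that can wrap `geolocator.reverse` to enforce the delay.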
So I've been trying to figure out how to write a function that I could pass to pandas' DataFrame.apply() in order to get quicker results.
def get_zipcodes():
    location = geolocator.reverse([row["latitude"], row["longitude"]])
    county = location.raw['address']['county']
    print(county)

counties = get_zipcodes()
I'm stuck and don't know how to use apply (or any other clever method) here. Any help is much appreciated.
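One possible shape for such a function, as a minimal sketch rather than a definitive answer: with `axis=1`, `DataFrame.apply` passes each row to the function, so the function has to accept the row as a parameter and return the value instead of printing it. The names `get_county` and `fake_reverse` are hypothetical, and `"Example County"` is a placeholder, not real geocoder output. The stub replaces `geolocator.reverse` so the snippet runs without network access.

```python
from types import SimpleNamespace

def get_county(row, reverse):
    """Return the county for one row, or None if the lookup finds no county."""
    location = reverse((row["latitude"], row["longitude"]))
    if location is None:
        return None  # geopy's reverse returns None when nothing is found
    # .get avoids a KeyError for places whose address has no 'county' entry
    return location.raw.get("address", {}).get("county")

# Offline stand-in for geolocator.reverse (placeholder data, not a real result).
def fake_reverse(coords):
    return SimpleNamespace(raw={"address": {"county": "Example County"}})

row = {"latitude": 34.19, "longitude": -114.31}
county = get_county(row, fake_reverse)
missing = get_county(row, lambda coords: None)

# With the real objects from the question, the call would look like:
# df["county"] = df.apply(get_county, axis=1, args=(geolocator.reverse,))
```

Note that `apply` still calls the function once per row, so this tidies the code but does not by itself make the geocoding requests any faster.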