4

I'm trying to plot data onto a map. I would like to generate data for specific points on the map (e.g. transit times to one or more prespecified location) for a specific city.

I found data for New York City here: https://data.cityofnewyork.us/City-Government/Borough-Boundaries/tqmj-j8zm

It looks like they have a shapefile available. I'm wondering if there is a way to sample a latitude-longitude grid within the bounds of the shapefile for each borough (perhaps using Shapely package, etc).

Sorry if this is naive, I'm not very familiar with working with these files--I'm doing this as a fun project to learn about them

Georgy
  • 12,464
  • 7
  • 65
  • 73
lstbl
  • 527
  • 5
  • 17
  • Related: [Get all lattice points lying inside a Shapely polygon](https://stackoverflow.com/q/44399749/7851470) – Georgy Feb 28 '20 at 10:31

1 Answers1

12

I figured out how to do this. Essentially, I just created a full grid of points and then removed those that did not fall within the shape files corresponding to the boroughs. Here is the code:

import geopandas
from geopandas import GeoDataFrame, GeoSeries
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize
import matplotlib.cm as cm
%matplotlib inline
import seaborn as sns
from shapely.geometry import Point, Polygon
import numpy as np
import googlemaps
from datetime import datetime
plt.rcParams["figure.figsize"] = [8,6]

# Get the shape-file for NYC
boros = GeoDataFrame.from_file('./Borough Boundaries/geo_export_b641af01-6163-4293-8b3b-e17ca659ed08.shp')
boros = boros.set_index('boro_code')
boros = boros.sort_index()

# Plot and color by borough
boros.plot(column = 'boro_name')

# Get rid of are that you aren't interested in (too far away)
plt.gca().set_xlim([-74.05, -73.85])
plt.gca().set_ylim([40.65, 40.9])

# make a grid of latitude-longitude values
xmin, xmax, ymin, ymax = -74.05, -73.85, 40.65, 40.9
xx, yy = np.meshgrid(np.linspace(xmin,xmax,100), np.linspace(ymin,ymax,100))
xc = xx.flatten()
yc = yy.flatten()

# Now convert these points to geo-data
pts = GeoSeries([Point(x, y) for x, y in zip(xc, yc)])
in_map =  np.array([pts.within(geom) for geom in boros.geometry]).sum(axis=0)
pts = GeoSeries([val for pos,val in enumerate(pts) if in_map[pos]])

# Plot to make sure it makes sense:
pts.plot(markersize=1)

# Now get the lat-long coordinates in a dataframe
coords = []
for n, point in enumerate(pts):
    coords += [','.join(__ for __ in _.strip().split(' ')[::-1]) for _ in str(point).split('(')[1].split(')')[0].split(',')]

which results in the following plots: boroughs

borough_gridpoints

I also got a matrix of lat-long coordinates I used to make a transport-time map for every point in the city to Columbia Medical Campus. Here is that map:

transit_time_full

and a zoomed-up version so you can see how the map is made up of the individual points: transit_time_zoom

lstbl
  • 527
  • 5
  • 17