
I have a pandas DataFrame (pandas.core.frame.DataFrame) with many attributes. I would like to convert it to a GeoDataFrame and export it as GeoJSON. It has the columns 'geometry.type' and 'geometry.coordinates'; both are pandas.core.series.Series. An example excerpt is below. Note that each geometry.coordinates value is a nested list.

geometry.type geometry.coordinates
MultiLineString [[[-74.07224, 40.64417], [-74.07012, 40.64506], [-74.06953, 40.64547], [-74.03249, 40.68565], [-74.01335, 40.69824], [-74.0128, 40.69866], [-74.01265, 40.69907], [-74.01296, 40.70048]], [[-74.01296, 40.70048], [-74.01265, 40.69907], [-74.0128, 40.69866], [-74.01335, 40.69824], [-74.03249, 40.68565], [-74.06953, 40.64547], [-74.07012, 40.64506], [-74.07224, 40.64417]]]

I would like to combine the two into a proper geometry column so that I can export the data as GeoJSON.

SMar3552
  • Hi there. This seems related to your previous post: https://stackoverflow.com/questions/73493033/json-from-url-to-geodataframe. It seems like you're heading in the wrong direction. Can you just post an update to your previous question with the traceback, so we can help you read from the geojson file directly rather than debugging the format you've converted the geojson into? – Michael Delgado Aug 26 '22 at 17:09

1 Answer


Taking your previous question into account as well:

  • pandas json_normalize() can be used to create a dataframe from the JSON source. This also expands the nested dicts into dotted column names such as geometry.coordinates
  • it's then a simple case of selecting the columns you want as properties (I have also renamed them to drop the "properties." prefix)
  • build the geometry column from geometry.coordinates
import urllib.request, json
import pandas as pd
import geopandas as gpd
import shapely.geometry


with urllib.request.urlopen(
    "https://transit.land/api/v2/rest/routes.geojson?operator_onestop_id=o-9q8y-sfmta&api_key=LsyqCJs5aYI6uyxvUz1d0VQQLYoDYdh4&l&"
) as url:
    data = json.loads(url.read())

df = pd.json_normalize(data["features"])
# keep only the columns that were "properties" in the (almost GeoJSON) input
gdf = gpd.GeoDataFrame(
    data=df.loc[:, [c for c in df.columns if c.startswith("properties.")]].pipe(
        lambda d: d.rename(columns={c: ".".join(c.split(".")[1:]) for c in d.columns})
    ),
    # build geometry from the coordinates
    geometry=df["geometry.coordinates"].apply(shapely.geometry.MultiLineString),
    # GeoJSON coordinates are WGS 84 longitude/latitude, i.e. EPSG:4326
    crs="epsg:4326",
)
gdf
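
If the frame mixes geometry types, or you would rather not hard-code MultiLineString, the two flattened columns can be recombined into a GeoJSON-style mapping and handed to shapely.geometry.shape, which picks the geometry class from the "type" member. The following is only a minimal sketch of that idea, not a tested drop-in: the two-row frame and the route_name property are made up for illustration (the coordinates are shortened from the excerpt in the question), and EPSG:4326 is assumed because GeoJSON uses longitude/latitude.

```python
import json

import geopandas as gpd
import pandas as pd
from shapely.geometry import shape

# Hypothetical frame shaped like the json_normalize() output; values are illustrative.
df = pd.DataFrame(
    {
        "properties.route_name": ["example-a", "example-b"],  # made-up property
        "geometry.type": ["MultiLineString", "LineString"],
        "geometry.coordinates": [
            [
                [[-74.07224, 40.64417], [-74.07012, 40.64506]],
                [[-74.01296, 40.70048], [-74.01265, 40.69907]],
            ],
            [[-74.0128, 40.69866], [-74.01335, 40.69824]],
        ],
    }
)

# Recombine type + coordinates into a GeoJSON-like dict per row;
# shape() returns the matching shapely geometry.
geometry = [
    shape({"type": geom_type, "coordinates": coords})
    for geom_type, coords in zip(df["geometry.type"], df["geometry.coordinates"])
]

# Keep only the "properties.*" columns and strip the prefix.
props = df.filter(like="properties.").rename(columns=lambda c: c.split(".", 1)[1])

gdf = gpd.GeoDataFrame(props, geometry=geometry, crs="EPSG:4326")

# Serialise to a GeoJSON FeatureCollection string.
geojson_text = gdf.to_json()
print([f["geometry"]["type"] for f in json.loads(geojson_text)["features"]])
```

To write a file instead of a string, gdf.to_file("routes.geojson", driver="GeoJSON") should work as well; the file name here is just a placeholder.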
Rob Raymond