
I am using the Mapbox API to get more information about a region from its coordinates. The API call returns JSON, but I cannot get pandas to store it as a DataFrame using

pandas.read_json

https://pandas.pydata.org/pandas-docs/version/0.25.3/reference/api/pandas.read_json.html

Here is an example of the JSON the API request returns:

{"type":"FeatureCollection","query":[-73.989,40.733],"features":[{"id":"address.5528394502635160","type":"Feature","place_type":["address"],"relevance":1,"properties":{"accuracy":"rooftop"},"text":"East 13th Street","place_name":"120 East 13th Street, New York, New York 10003, United States","center":[-73.98893045,40.73295105],"geometry":{"type":"Point","coordinates":[-73.98893045,40.73295105]},"address":"120","context":[{"id":"neighborhood.2103290","text":"Greenwich Village"},{"id":"postcode.13482670360296810","text":"10003"},{"id":"locality.12696928000137850","wikidata":"Q11299","text":"Manhattan"},{"id":"place.2618194975964500","wikidata":"Q60","text":"New York"},{"id":"district.12113562209855570","wikidata":"Q500416","text":"New York County"},{"id":"region.17349986251855570","wikidata":"Q1384","short_code":"US-NY","text":"New York"},{"id":"country.19678805456372290","wikidata":"Q30","short_code":"us","text":"United States"}]},{"id":"neighborhood.2103290","type":"Feature","place_type":["neighborhood"],"relevance":1,"properties":{},"text":"Greenwich Village","place_name":"Greenwich Village, New York, New York 10003, United States","bbox":[-74.005282,40.72586,-73.98734,40.73907],"center":[-74.0029,40.7284],"geometry":{"type":"Point","coordinates":[-74.0029,40.7284]},"context":[{"id":"postcode.13482670360296810","text":"10003"},{"id":"locality.12696928000137850","wikidata":"Q11299","text":"Manhattan"},{"id":"place.2618194975964500","wikidata":"Q60","text":"New York"},{"id":"district.12113562209855570","wikidata":"Q500416","text":"New York County"},{"id":"region.17349986251855570","wikidata":"Q1384","short_code":"US-NY","text":"New York"},{"id":"country.19678805456372290","wikidata":"Q30","short_code":"us","text":"United States"}]},{"id":"postcode.13482670360296810","type":"Feature","place_type":["postcode"],"relevance":1,"properties":{},"text":"10003","place_name":"New York, New York 10003, United 
States","bbox":[-73.9996058238451,40.7229310019,-73.9798620096375,40.7396749960342],"center":[-73.99,40.73],"geometry":{"type":"Point","coordinates":[-73.99,40.73]},"context":[{"id":"locality.12696928000137850","wikidata":"Q11299","text":"Manhattan"},{"id":"place.2618194975964500","wikidata":"Q60","text":"New York"},{"id":"district.12113562209855570","wikidata":"Q500416","text":"New York County"},{"id":"region.17349986251855570","wikidata":"Q1384","short_code":"US-NY","text":"New York"},{"id":"country.19678805456372290","wikidata":"Q30","short_code":"us","text":"United States"}]},{"id":"locality.12696928000137850","type":"Feature","place_type":["locality"],"relevance":1,"properties":{"wikidata":"Q11299"},"text":"Manhattan","place_name":"Manhattan, New York, United States","bbox":[-74.047313153061,40.679573,-73.907,40.8820749648427],"center":[-73.9597,40.7903],"geometry":{"type":"Point","coordinates":[-73.9597,40.7903]},"context":[{"id":"place.2618194975964500","wikidata":"Q60","text":"New York"},{"id":"district.12113562209855570","wikidata":"Q500416","text":"New York County"},{"id":"region.17349986251855570","wikidata":"Q1384","short_code":"US-NY","text":"New York"},{"id":"country.19678805456372290","wikidata":"Q30","short_code":"us","text":"United States"}]},{"id":"place.2618194975964500","type":"Feature","place_type":["place"],"relevance":1,"properties":{"wikidata":"Q60"},"text":"New York","place_name":"New York, New York, United States","bbox":[-74.25909,40.477399,-73.700272,40.917577],"center":[-73.9866,40.7306],"geometry":{"type":"Point","coordinates":[-73.9866,40.7306]},"context":[{"id":"district.12113562209855570","wikidata":"Q500416","text":"New York County"},{"id":"region.17349986251855570","wikidata":"Q1384","short_code":"US-NY","text":"New York"},{"id":"country.19678805456372290","wikidata":"Q30","short_code":"us","text":"United 
States"}]},{"id":"district.12113562209855570","type":"Feature","place_type":["district"],"relevance":1,"properties":{"wikidata":"Q500416"},"text":"New York County","place_name":"New York County, New York, United States","bbox":[-74.047227,40.682932,-73.907,40.879278],"center":[-74,40.7167],"geometry":{"type":"Point","coordinates":[-74,40.7167]},"context":[{"id":"region.17349986251855570","wikidata":"Q1384","short_code":"US-NY","text":"New York"},{"id":"country.19678805456372290","wikidata":"Q30","short_code":"us","text":"United States"}]},{"id":"region.17349986251855570","type":"Feature","place_type":["region"],"relevance":1,"properties":{"wikidata":"Q1384","short_code":"US-NY"},"text":"New York","place_name":"New York, United States","bbox":[-79.8578350999901,40.4771391062446,-71.7564918092633,45.0239286969073],"center":[-75.4652471468304,42.751210955],"geometry":{"type":"Point","coordinates":[-75.4652471468304,42.751210955]},"context":[{"id":"country.19678805456372290","wikidata":"Q30","short_code":"us","text":"United States"}]},{"id":"country.19678805456372290","type":"Feature","place_type":["country"],"relevance":1,"properties":{"wikidata":"Q30","short_code":"us"},"text":"United States","place_name":"United States","bbox":[-179.9,18.8163608007951,-66.8847646185949,71.4202919997506],"center":[-97.9222112121185,39.3812661305678],"geometry":{"type":"Point","coordinates":[-97.9222112121185,39.3812661305678]}}],"attribution":"NOTICE: © 2021 Mapbox and its suppliers. All rights reserved. Use of this data is subject to the Mapbox Terms of Service (https://www.mapbox.com/about/maps/). This response and the information it contains may not be retained. POI(s) provided by Foursquare."}

Here is my code:

import pandas as pd

url = "https://api.mapbox.com/geocoding/v5/mapbox.places/-73.989,40.733.json?access_token=MY_KEY_HERE"

df = pd.read_json(url, orient='split')

return df

I have tried orient = 'split', 'index', 'records', 'columns', and 'values', but most of the time it raises "ValueError: arrays must all be same length". What do I need to do to get pandas.read_json to recognize this API response as valid JSON?

Output: ValueError: arrays must all be same length

Expected: the returned JSON is read and stored in a pandas DataFrame

Anonymous
  • Pandas can't take arbitrary JSON input. It has to be possible to represent the data as a table, which clearly isn't the case for that JSON blob of nested fields. You'll have to construct the dataframe with your own code. – Mikael Öhman Jul 20 '21 at 01:00
  • Hi Mikael, do you have more information/an example of what you mean? I'm not quite sure what you mean by having to construct the dataframe with my own code if I'm trying to import a .json. – Anonymous Jul 20 '21 at 22:48
  • You have to decide exactly which columns you want in your dataframe. Then you need to write custom code that extracts those pieces of information from your results. You'll want to do that part like https://stackoverflow.com/questions/6386308/http-requests-and-json-parsing-in-python and then write your own loop(s) to extract the information you want and add it to your dataframe. – Mikael Öhman Jul 21 '21 at 01:07
  • That link is a helpful resource, let me try it. Thank you. – Anonymous Jul 21 '21 at 06:00
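
A minimal sketch of the manual approach described in the comments above, assuming the response has already been parsed into a Python dict. The function name `features_to_rows`, the choice of columns, and the trimmed `sample` dict are illustrative only; they are not part of the Mapbox API or pandas.

```python
def features_to_rows(response):
    """Pick a few fields from each feature (the column choice is illustrative)."""
    rows = []
    for feature in response.get("features", []):
        lon, lat = feature["center"]  # "center" is [longitude, latitude]
        rows.append({
            "id": feature["id"],
            "place_type": feature["place_type"][0],
            "place_name": feature["place_name"],
            "lon": lon,
            "lat": lat,
        })
    return rows


# Trimmed copy of the response shown in the question (two features only)
sample = {
    "type": "FeatureCollection",
    "query": [-73.989, 40.733],
    "features": [
        {
            "id": "address.5528394502635160",
            "place_type": ["address"],
            "place_name": "120 East 13th Street, New York, New York 10003, United States",
            "center": [-73.98893045, 40.73295105],
        },
        {
            "id": "neighborhood.2103290",
            "place_type": ["neighborhood"],
            "place_name": "Greenwich Village, New York, New York 10003, United States",
            "center": [-74.0029, 40.7284],
        },
    ],
}

rows = features_to_rows(sample)
```

Each dict in `rows` has the same keys, so the list can be passed to `pd.DataFrame(rows)` to get one row per feature.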

1 Answer


Pandas provides a utility function, pd.json_normalize(), to normalize semi-structured data into a flat table. For your JSON response, this should work:

import pandas as pd

data = <your JSON response>  # parsed into a dict, not the raw string or the URL

df = pd.json_normalize(data, 'features')
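
As a self-contained illustration of the call above, here is a sketch that runs it on a trimmed copy of the response from the question. It assumes pandas 1.0 or later (where `pd.json_normalize` is available at the top level); the inline `data` dict stands in for the live API response.

```python
import pandas as pd

# Trimmed copy of the response from the question, already parsed into a dict
data = {
    "type": "FeatureCollection",
    "query": [-73.989, 40.733],
    "features": [
        {
            "id": "address.5528394502635160",
            "type": "Feature",
            "place_type": ["address"],
            "properties": {"accuracy": "rooftop"},
            "text": "East 13th Street",
            "geometry": {"type": "Point", "coordinates": [-73.98893045, 40.73295105]},
        },
        {
            "id": "neighborhood.2103290",
            "type": "Feature",
            "place_type": ["neighborhood"],
            "properties": {},
            "text": "Greenwich Village",
            "geometry": {"type": "Point", "coordinates": [-74.0029, 40.7284]},
        },
    ],
}

# Each element of data["features"] becomes one row. Nested dicts such as
# "geometry" are flattened into dotted column names; lists stay as lists.
df = pd.json_normalize(data, "features")
print(df.columns.tolist())
```

With a live request, the response body would first need to be parsed into a dict, for example with `json.loads` or, if the `requests` library is used, `requests.get(url).json()`.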

You can find examples in the latest version of the pandas user guide. For some reason I couldn't find a dedicated page for it in the latest version of the pandas API reference. However, it is documented in older versions, where you can also find a list of parameters.
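
To give a feel for a few of those parameters, here is a hedged sketch using `record_path`, `meta`, and `sep`, again on a trimmed stand-in for the response (the `attribution` string is shortened). Which top-level keys are worth carrying along as metadata is an assumption made for illustration.

```python
import pandas as pd

# Trimmed stand-in for the API response; "attribution" is shortened
data = {
    "type": "FeatureCollection",
    "attribution": "NOTICE: © 2021 Mapbox and its suppliers. All rights reserved.",
    "features": [
        {
            "id": "address.5528394502635160",
            "text": "East 13th Street",
            "geometry": {"type": "Point", "coordinates": [-73.98893045, 40.73295105]},
        },
        {
            "id": "neighborhood.2103290",
            "text": "Greenwich Village",
            "geometry": {"type": "Point", "coordinates": [-74.0029, 40.7284]},
        },
    ],
}

df = pd.json_normalize(
    data,
    record_path="features",   # the list whose items become rows
    meta=["attribution"],     # top-level field repeated on every row
    sep="_",                  # "geometry_type" instead of "geometry.type"
)
print(df.columns.tolist())
```

A `meta` name must not collide with a key inside the records (for example `type` here), otherwise pandas raises an error asking for a distinguishing prefix such as `meta_prefix`.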

sarrysyst
  • Thank you very much! This resolves the problem! Thank you for the documentation and the example. I now understand why you had the 'features' parameter. You're a life saver. – Anonymous Jul 23 '21 at 18:24