Turning JSON into dataframe with pandas

Question

I'm trying to get a data frame but keep running into various error messages depending on the arguments I specify in read.json after I specify my file.

I've run through many of the arguments in the pandas.read_json documentation, but haven't been able to identify a solution.

import pandas
json_file = "https://gis.fema.gov/arcgis/rest/services/NSS/OpenShelters/MapServer/0/query?where=1%3D1&outFields=*&returnGeometry=false&outSR=4326&f=json"
pandas.read_json(json_file)

I'm trying to get a data frame but keep running into various error messages depending on the arguments I specify in read.json after I specify my file.

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.from_records.html — Alex, Jul 09 '19 at 18:02
https://stackoverflow.com/questions/21104592/json-to-pandas-dataframe I think your task looks similar to this one! — Deep Bhatt, Jul 09 '19 at 18:07

score 1 · Answer 1 · answered Jul 09 '19 at 18:47

Because the JSON is not directly convertible to DataFrame. read_json works only with a few formats defined by the orient parameter. Your JSON doesn't follow any of the allowed formats so you need to manipulate the JSON before converting it to a data frame.

Let's take a high level look at your JSON:

{
    "displayFieldName": ...,
    "fieldAliases": {...},
    "fields": {...},
    "features": [...]
}

I'm gonna fathom a guess and assume the features node is what you want. Let's div deeper into features:

"features": [
    {
        "attributes": {
            "OBJECTID": 1,
            "SHELTER_ID": 223259,
            ...
        }
    },
    {
        "attributes": {
            "OBJECTID": 2,
            "SHELTER_ID": 223331,
            ...
        }
    },
    ...
]

features contains a list of objects, each having an attributes node. The data contained in the attributes node is what you actually want.

Here's the code

import pandas as pd
import json
from urllib.request import urlopen

json_file = "https://gis.fema.gov/arcgis/rest/services/NSS/OpenShelters/MapServer/0/query?where=1%3D1&outFields=*&returnGeometry=false&outSR=4326&f=json"
data = urlopen(json_file).read()
raw_json = json.loads(data)
formatted_json = [feature['attributes'] for feature in raw_json['features']]

formatted_json is now a list of dictionaries containing the data we are after. It is no longer JSON. To create the data frame:

df = pd.DataFrame(formatted_json)

This works, but now I am having issues with the proxy. I've tried mapping my proxies and then feeding it as a parameter to urlopen(), but that's only for Python 2. I tried working with ProxyHandler but couldn't get anything to work. — Jeff Hall, Jul 10 '19 at 11:23
The network is a separate issue from the programming one. I can't help with that, sorry :( — Code Different, Jul 10 '19 at 11:25

Turning JSON into dataframe with pandas

1 Answers1