
I have very large JSON data with the following structure:

[
 {
   "origin": 101011001,
   "destinations": [
    {"destination": 101011001, "people": 7378},
    {"destination": 101011002, "people": 120}
   ]
 },
 {
   "origin": 101011002,
   "destinations": [
    {"destination": 101011001, "people": 754},
 }
]


My goal is to convert the data to a pandas DataFrame, which I then want to write to a table in my PostgreSQL database.

I want to create a pandas DataFrame like this:

origin    destination  people
101011001 101011001    7378
101011001 101011002    120 
101011002 101011001    754

Right now, using pandas.read_json(), I can only get the columns 'origin' and 'destinations', where each 'destinations' entry is a list containing both the destination and people values.
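For reference, this is roughly what I am doing at the moment (the file name is just a placeholder for my actual file):

import pandas as pd

# Current attempt: 'destinations' comes back as a raw list of dicts per row
df = pd.read_json("flows.json")  # "flows.json" is a placeholder
print(df.columns)  # Index(['origin', 'destinations'], dtype='object')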

How can I achieve the above dataframe?

Jenny Char

1 Answer


Use json_normalize. This should work as intended:

Edit: parse the string into a list of dicts, then apply json_normalize.

data = """[
 {
   "origin": 101011001,
   "destinations": [
    {"destination": 101011001, "people": 7378},
    {"destination": 101011002, "people": 120}
   ]
 },
 {
   "origin": 101011002,
   "destinations": [
    {"destination": 101011001, "people": 754}]
 }
]"""

from pandas import json_normalize
import json

data = json.loads(data)
df = json_normalize(data,"destinations",['origin'])
df.head()
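Note that json_normalize puts the flattened record columns first, so the result comes out as destination, people, origin. If you then want to load it into PostgreSQL, a minimal sketch using SQLAlchemy could look like this (the connection string and table name are placeholders, adjust them to your setup):

from sqlalchemy import create_engine

# Reorder the columns to match the layout shown in the question
df = df[["origin", "destination", "people"]]

# Placeholder connection string and table name
engine = create_engine("postgresql://user:password@localhost:5432/mydb")
df.to_sql("origin_destination_flows", engine, if_exists="replace", index=False)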
Partha Mandal