
I am trying to load JSON data from an API into a Pandas DataFrame. I could not get pandas.read_json to work with the API data, so I currently have a for loop running through the JSON to extract the values I want. I'm sure it's not the best solution.

Here is what I have:

import json
import urllib.request
import pandas as pd

r = urllib.request.urlopen("https://graph.facebook.com/v3.1/{page-id}/insights?access_token={access-token}&pretty=0&metric=page_impressions%2cpage_engaged_users%2cpage_fans%2cpage_video_views%2cpage_posts_impressions").read()

output = json.loads(r)

for item in output['data']:
    name = item['name']
    period = item['period']
    value = item['values'][0]['value']

    df = [{'Name': name, 'Period': period, 'Value': value}]

    df = pd.DataFrame(df)

    print(df)

And here is an excerpt of the JSON from the API:

{
  "data": [
    {
      "name": "page_video_views",
      "period": "day",
      "values": [
        {
          "value": 634,
          "end_time": "2018-11-23T08:00:00+0000"
        },
        {
          "value": 465,
          "end_time": "2018-11-24T08:00:00+0000"
        }
      ],
      "title": "Daily Total Video Views",
      "description": "Daily: Total number of times videos have been viewed for more than 3 seconds. (Total Count)",
      "id": "{page-id}/insights/page_video_views/day"
    },

The issue I am now facing is that, because of the for loop (I believe), each row of data ends up in its own DataFrame, like so:

               Name Period  Value
0  page_video_views    day    465
               Name Period  Value
0  page_video_views   week   3257
               Name   Period  Value
0  page_video_views  days_28   9987
               Name Period  Value
0  page_impressions    day   1402

How can I pass all of them easily into the same DataFrame like so?

               Name   Period  Value
0  page_video_views      day    465
1  page_video_views     week   3257
2  page_video_views  days_28   9987
3  page_impressions      day   1402

Again, I know this most likely isn't the best solution so any suggestions on how to improve any aspect are very welcome.

Cleb
Hayden

3 Answers


You can build a list of dictionaries inside the loop and pass it to the DataFrame constructor once, after the loop:

L = []
for item in output['data']:
    name = item['name']
    period = item['period']
    value = item['values'][0]['value']

    L.append({'Name': name, 'Period': period, 'Value': value})

df = pd.DataFrame(L)

Or use a list comprehension:

L = [{'Name': item['name'], 'Period': item['period'], 'Value': item['values'][0]['value']}
     for item in output['data']]

df = pd.DataFrame(L)
print(df)
               Name Period  Value
0  page_video_views    day    634

Sample for testing:

output = {
  "data": [
    {
      "name": "page_video_views",
      "period": "day",
      "values": [
        {
          "value": 634,
          "end_time": "2018-11-23T08:00:00+0000"
        },
        {
          "value": 465,
          "end_time": "2018-11-24T08:00:00+0000"
        }
      ],
      "title": "Daily Total Video Views",
      "description": "Daily: Total number of times videos have been viewed for more than 3 seconds. (Total Count)",
      "id": "{page-id}/insights/page_video_views/day"
    }]}
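
Note that the loop above keeps only the first entry of each metric's `values` list. If every entry is wanted, a nested list comprehension can emit one row per entry. This is only a minimal sketch, assuming each item in `output['data']` has the `name`, `period`, and `values` keys shown in the excerpt; the `End time` column name and the trimmed `output` sample are illustrative, not part of the original code.

```python
import pandas as pd

# Trimmed copy of the sample payload (assumed shape of the API response).
output = {
    "data": [
        {
            "name": "page_video_views",
            "period": "day",
            "values": [
                {"value": 634, "end_time": "2018-11-23T08:00:00+0000"},
                {"value": 465, "end_time": "2018-11-24T08:00:00+0000"},
            ],
        }
    ]
}

# One row per entry in each metric's "values" list,
# repeating the metric's name and period on every row.
rows = [
    {
        "Name": item["name"],
        "Period": item["period"],
        "End time": entry["end_time"],  # hypothetical column name
        "Value": entry["value"],
    }
    for item in output["data"]
    for entry in item["values"]
]

df = pd.DataFrame(rows)
print(df)
```

Because the DataFrame is built once from the complete list, all rows share a single index instead of each getting its own frame.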
jezrael

Try converting the dictionary to a DataFrame after loading the JSON, like this:

output = json.loads(r)
df = pd.DataFrame.from_dict(output , orient='index')
df.reset_index(level=0, inplace=True)
Serenity
  • Throws error "Expected list, got dict" on `df = pd.DataFrame.from_dict(output , orient='index')` – Hayden Nov 28 '18 at 05:22
  • If I could get this to work and avoid the For loop I would be elated, but for some reason the JSON coming from Graph API does not seem to play nice – Hayden Nov 28 '18 at 05:26
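
One loop-free option that may fit here is `pd.json_normalize`, which flattens a list of nested records into a table. The sketch below assumes pandas 1.0 or later (older versions expose the function as `pandas.io.json.json_normalize`) and assumes the response has the shape shown in the question's excerpt; the trimmed `output` sample stands in for the real API response.

```python
import pandas as pd

# Trimmed copy of the sample payload (assumed shape of the API response).
output = {
    "data": [
        {
            "name": "page_video_views",
            "period": "day",
            "values": [
                {"value": 634, "end_time": "2018-11-23T08:00:00+0000"},
                {"value": 465, "end_time": "2018-11-24T08:00:00+0000"},
            ],
        }
    ]
}

# record_path: the nested list that becomes the rows.
# meta: parent-level fields repeated on each of those rows.
df = pd.json_normalize(
    output["data"],
    record_path="values",
    meta=["name", "period"],
)
print(df)
```

The list under the top-level `data` key is passed in, not the whole response, since the records live one level down. The resulting columns are `value`, `end_time`, `name`, and `period`, and they can be renamed with `df.rename(columns=...)` if the capitalized headers are preferred.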

If you are fetching the data from a URL, I would suggest this approach: parse the response and keep only the part stored under the key you need (here, "data").

import pandas as pd
import requests

data = requests.get("url here").json()['data']

data is now a list of dictionaries, so you can pass it to pd.DataFrame to parse it:

df = pd.DataFrame(data)
varun chauhan