How to read this JSON into dataframe with specfic dataframe format

Question

This is my JSON string, I want to make it read into dataframe in the following tabular format.

I have no idea what should I do after pd.Dataframe(json.loads(data))

JSON data, edited

{  
       "data":[  
          {  
             "data":{  
                "actual":"(0.2)",
                "upper_end_of_central_tendency":"-"
             },
             "title":"2009"
          },
          {  
             "data":{  
                "actual":"2.8",
                "upper_end_of_central_tendency":"-"
             },
             "title":"2010"
          },
          {  
             "data":{  
                "actual":"-",
                "upper_end_of_central_tendency":"2.3"
             },
             "title":"longer_run"
          }
       ],
       "schedule_id":"2014-03-19"
    }

Have a look [here](http://stackoverflow.com/questions/13575090/construct-pandas-dataframe-from-items-in-nested-dictionary) — camdenl, Mar 25 '15 at 14:13

Jonathan Eunice · Accepted Answer · 2015-03-26T03:24:26.880

That's a somewhat overly nested JSON. But if that's what you have to work with, and assuming your parsed JSON is in jdata:

datapts = jdata['data']
rownames = ['actual', 'upper_end_of_central_tendency']
colnames = [ item['title'] for item in datapts ] + ['schedule_id' ]
sched_id = jdata['schedule_id']
rows = [ [item['data'][rn] for item in datapts ] + [sched_id] for rn in rownames]
df = pd.DataFrame(rows, index=rownames, columns=colnames)

df is now:

resulting data frame

If you wanted to simplify that a bit, you could construct the core data without the asymmetric schedule_id field, then add that after the fact:

datapts = jdata['data']
rownames = ['actual', 'upper_end_of_central_tendency']
colnames = [ item['title'] for item in datapts ] 
rows = [ [item['data'][rn] for item in datapts ] for rn in rownames]
d2 = pd.DataFrame(rows, index=rownames, columns=colnames)
d2['schedule_id'] = jdata['schedule_id']

That will make an identical DataFrame (i.e. df == d2). It helps when learning pandas to try a few different construction strategies, and get a feel for what is more straightforward. There are more powerful tools for unfolding nested structures into flatter tables, but they're not as easy to understand first time out of the gate.

(Update) If you wanted a better structuring on your JSON to make it easier to put into this format, ask pandas what it likes. E.g. df.to_json() output, slightly prettified:

{
  "2009": {
    "actual": "(0.2)",
    "upper_end_of_central_tendency": "-"
  },
  "2010": {
    "actual": "2.8",
    "upper_end_of_central_tendency": "-"
  },
  "longer_run": {
    "actual": "-",
    "upper_end_of_central_tendency": "2.3"
  },
  "schedule_id": {
    "actual": "2014-03-19",
    "upper_end_of_central_tendency": "2014-03-19"
  }
}

That is a format from which pandas' read_json function will immediately construct the DataFrame you desire.

Thanks for your advice. To make it eaiser to read into dataframe, how do you think that how can I change the JSON format ? — user3675188, Mar 26 '15 at 03:01
I demonstrated the better JSON format under "(Update)" above — Jonathan Eunice, Mar 26 '15 at 03:25

How to read this JSON into dataframe with specfic dataframe format

JSON data, edited

1 Answers1