0

This is my JSON string, I want to make it read into dataframe in the following tabular format.

I have no idea what should I do after pd.Dataframe(json.loads(data))

JSON data, edited

{  
       "data":[  
          {  
             "data":{  
                "actual":"(0.2)",
                "upper_end_of_central_tendency":"-"
             },
             "title":"2009"
          },
          {  
             "data":{  
                "actual":"2.8",
                "upper_end_of_central_tendency":"-"
             },
             "title":"2010"
          },
          {  
             "data":{  
                "actual":"-",
                "upper_end_of_central_tendency":"2.3"
             },
             "title":"longer_run"
          }
       ],
       "schedule_id":"2014-03-19"
    }
Jonathan Eunice
  • 21,653
  • 6
  • 75
  • 77
user3675188
  • 7,271
  • 11
  • 40
  • 76
  • Have a look [here](http://stackoverflow.com/questions/13575090/construct-pandas-dataframe-from-items-in-nested-dictionary) – camdenl Mar 25 '15 at 14:13

1 Answers1

1

That's a somewhat overly nested JSON. But if that's what you have to work with, and assuming your parsed JSON is in jdata:

datapts = jdata['data']
rownames = ['actual', 'upper_end_of_central_tendency']
colnames = [ item['title'] for item in datapts ] + ['schedule_id' ]
sched_id = jdata['schedule_id']
rows = [ [item['data'][rn] for item in datapts ] + [sched_id] for rn in rownames]
df = pd.DataFrame(rows, index=rownames, columns=colnames)

df is now:

resulting data frame

If you wanted to simplify that a bit, you could construct the core data without the asymmetric schedule_id field, then add that after the fact:

datapts = jdata['data']
rownames = ['actual', 'upper_end_of_central_tendency']
colnames = [ item['title'] for item in datapts ] 
rows = [ [item['data'][rn] for item in datapts ] for rn in rownames]
d2 = pd.DataFrame(rows, index=rownames, columns=colnames)
d2['schedule_id'] = jdata['schedule_id']

That will make an identical DataFrame (i.e. df == d2). It helps when learning pandas to try a few different construction strategies, and get a feel for what is more straightforward. There are more powerful tools for unfolding nested structures into flatter tables, but they're not as easy to understand first time out of the gate.

(Update) If you wanted a better structuring on your JSON to make it easier to put into this format, ask pandas what it likes. E.g. df.to_json() output, slightly prettified:

{
  "2009": {
    "actual": "(0.2)",
    "upper_end_of_central_tendency": "-"
  },
  "2010": {
    "actual": "2.8",
    "upper_end_of_central_tendency": "-"
  },
  "longer_run": {
    "actual": "-",
    "upper_end_of_central_tendency": "2.3"
  },
  "schedule_id": {
    "actual": "2014-03-19",
    "upper_end_of_central_tendency": "2014-03-19"
  }
}

That is a format from which pandas' read_json function will immediately construct the DataFrame you desire.

Jonathan Eunice
  • 21,653
  • 6
  • 75
  • 77