
In Python, I need to flatten a large nested dictionary that starts like this:

{u'February 19, 2016': {'calls': [{'%change': u'0.00%',
'ask': u'6.50',
'bid': u'5.20',
'change': u'0.00',
'interest': u'10',
'last': u'10.30',
'name': u'LVLT160219C00044000',
'strike': u'44.00',
'volatility': u'62.31%',
'volume': u'10'}]}}

into a DataFrame with columns similar to:

name  strike  date  type   last  ask bid  change  volume  volatility 

Thanks

Paul Rooney
Jameson
  • could you show us your attempt? – midori Feb 05 '16 at 01:34
  • make a working example too with the input; the braces and brackets are unbalanced – jxramos Feb 05 '16 at 01:43
  • @minitoto I would love to, haven't even gotten close yet. The method here http://stackoverflow.com/questions/10756427/loop-through-all-nested-dictionary-values is possibly showing some progress at the moment though. – Jameson Feb 05 '16 at 02:20

1 Answer


I would loop through your structure and convert it into a format that pandas recognizes automatically, such as a list of dicts with one dict per row. You'll have to customize it for your exact needs, but this is a proof of concept based on your current data structure.

import pandas as pd
#your data looks something like this:
mydict={"date1" : {"calls": [ {"change":1,"ask":4,"bid":5,"name":"x83"},
                              {"change":3,"ask":9,"bid":2,"name":"y99"} ] },
        "date2" : {"calls": [ {"change":4,"ask":3,"bid":7,"name":"z32"} ] } }

def convert(something):
    # Convert numeric strings to floats; leave anything else unchanged
    try:
        return float(something)
    except (TypeError, ValueError):
        return something

# make an empty sequence
dataseq = []
# list the fields you want from each call
desired = ["ask","change","bid","name"]

for thisdate, contracts in mydict.items():
    # get the calls for each date
    for thiscall in contracts["calls"]:
        # initialize a new dictionary for this call with the entry date
        d = {"date": thisdate}
        for field in desired:
            # get the data and convert to float if it's a float
            d[field] = convert(thiscall[field])
        # add it to your sequence
        dataseq.append(d)
# make a dataframe
a = pd.DataFrame(dataseq)
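
To get closer to the columns asked for in the question (including date and type), the same loop can also walk the contract-type level. The following is only a sketch run on the sample from the question: the names `chain` and `flatten_chain` are made up for illustration, and it assumes every key under a date (for example a hypothetical "puts" entry) holds a list of dicts shaped like "calls".

```python
import pandas as pd

# Sample shaped like the question's data (one date, one call contract).
chain = {
    "February 19, 2016": {
        "calls": [
            {
                "%change": "0.00%",
                "ask": "6.50",
                "bid": "5.20",
                "change": "0.00",
                "interest": "10",
                "last": "10.30",
                "name": "LVLT160219C00044000",
                "strike": "44.00",
                "volatility": "62.31%",
                "volume": "10",
            }
        ]
    }
}


def flatten_chain(chain):
    """Return one row per contract, tagged with its date and type.

    Hypothetical helper; assumes each value under a date is a dict that
    maps a contract type (e.g. "calls") to a list of contract dicts.
    """
    rows = []
    for date, by_type in chain.items():
        for contract_type, contracts in by_type.items():
            for contract in contracts:
                row = {"date": date, "type": contract_type}
                row.update(contract)
                rows.append(row)
    return pd.DataFrame(rows)


df = flatten_chain(chain)

# Convert the plainly numeric columns; percentage strings stay as text.
numeric_cols = ["strike", "last", "ask", "bid", "change", "volume"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric)

# Reorder to match the layout requested in the question.
df = df[["name", "strike", "date", "type", "last", "ask", "bid",
         "change", "volume", "volatility"]]
print(df)
```

Columns such as "volatility" keep their "%" suffix here; stripping it and dividing by 100 would be a separate step if you need numbers.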
22degrees