4

I have a 3-level dictionary like this:

data={'2016-11-28': {'area1': {'am': -0.007, 'pm': 0.008}, 'area2': {'am': 0.0, 'pm': 0.0}, 'area3': {'am': -0.01, 'pm': -0.001}},'2016-11-29':{'area1': {'am': -0.007, 'pm': 0.008}, 'area2': {'am': 0.0, 'pm': 0.0}, 'area3': {'am': -0.01, 'pm': -0.001}}}

I want to convert it to a dataframe, and I tried:

tickers=data['2016-11-28'].keys()
iterables=[tickers,['am','pm']]
index=pd.MultiIndex.from_product(iterables, names=['ticker', 'time'])
frame=pd.DataFrame(data,index=index)

but I got

                2016-11-28  2016-11-29
ticker time                        
area1  am           NaN         NaN
       pm           NaN         NaN
area3  am           NaN         NaN
       pm           NaN         NaN
area2  am           NaN         NaN
       pm           NaN         NaN

There are no values in the dataframe, only column names and index names. What's wrong with my code? Can someone help? Thanks very much!

Dimitris Fasarakis Hilliard
  • 150,925
  • 31
  • 268
  • 253
Steven
  • 95
  • 1
  • 4
  • 1
    http://stackoverflow.com/questions/13575090/construct-pandas-dataframe-from-items-in-nested-dictionary – acushner Dec 22 '16 at 15:28

1 Answers1

2

Here is my own solution: triple for loop to force dictionary to conform to rules for hierarchical index which is {'col1':{('row1_level0', 'row1_level1'):value}}

Which will look like this when used

pd.DataFrame({'col1':{('rowidx0_level0', 'rowidx0_level1'):5}})

                         col1
rowidx0_level0 rowidx0_level1     5

And here is the implementation

d = {}
for date, areas in data.items():
    d[date] = {}
    for area, times in areas.items():
        for time, value in times.items():
            d[date][(area, time)] = value 

pd.DataFrame(d)

          2016-11-28  2016-11-29
area1 am      -0.007      -0.007
      pm       0.008       0.008
area2 am       0.000       0.000
      pm       0.000       0.000
area3 am      -0.010      -0.010
      pm      -0.001      -0.001

And this is what the actual dictionary d looks like:

{'2016-11-28': {('area1', 'am'): -0.007,
  ('area1', 'pm'): 0.008,
  ('area2', 'am'): 0.0,
  ('area2', 'pm'): 0.0,
  ('area3', 'am'): -0.01,
  ('area3', 'pm'): -0.001},
 '2016-11-29': {('area1', 'am'): -0.007,
  ('area1', 'pm'): 0.008,
  ('area2', 'am'): 0.0,
  ('area2', 'pm'): 0.0,
  ('area3', 'am'): -0.01,
  ('area3', 'pm'): -0.001}}

Adopting the answer linked to from @acushner.

dates = []
frames = []

for date, d in data.items():
    dates.append(date)
    frames.append(pd.DataFrame.from_dict(d, orient='index').stack())

pd.concat(frames, keys=dates, axis=1)

          2016-11-28  2016-11-29
area1 pm       0.008       0.008
      am      -0.007      -0.007
area2 pm       0.000       0.000
      am       0.000       0.000
area3 pm      -0.001      -0.001
      am      -0.010      -0.010
Ted Petrou
  • 59,042
  • 19
  • 131
  • 136