
I want to resample a DataFrame with a multi-index containing both a datetime column and some other key. The DataFrame looks like:

import pandas as pd
from StringIO import StringIO

csv = StringIO("""ID,NAME,DATE,VAR1
1,a,03-JAN-2013,69
1,a,04-JAN-2013,77
1,a,05-JAN-2013,75
2,b,03-JAN-2013,69
2,b,04-JAN-2013,75
2,b,05-JAN-2013,72""")

df = pd.read_csv(csv, index_col=['DATE', 'ID'], parse_dates=['DATE'])
df.columns.name = 'Params'

Because resampling is only allowed on datetime indexes, I thought unstacking the other index column would help. And indeed it does, but I can't stack it again afterwards.

print df.unstack('ID').resample('W-THU')

Params      VAR1      
ID               1     2
DATE                    
2013-01-03      69  69.0
2013-01-10      76  73.5

But then stacking 'ID' again results in an IndexError:

print df.unstack('ID').resample('W-THU').stack('ID')

IndexError: index 0 is out of bounds for axis 0 with size 0

Strangely enough, I can stack the other column level with both:

print df.unstack('ID').resample('W-THU').stack(0)

and

print df.unstack('ID').resample('W-THU').stack('Params')

The IndexError also occurs if I reorder (swap) both column levels. Does anyone know how to overcome this issue?

Engineero
Rutger Kassies

1 Answer


The example unstacks a non-numerical column, 'NAME', which is silently dropped but still causes problems during re-stacking. Selecting only 'VAR1' before unstacking worked for me:

print df[['VAR1']].unstack('ID').resample('W-THU').stack('ID')
Params         VAR1
DATE       ID
2013-01-03 A   69.0
           B   69.0
2013-01-10 A   76.0
           B   73.5
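
For readers on Python 3 with a recent pandas release, the same idea can be sketched as below. This is only an illustration under a few assumptions that are not part of the original post: that `resample` returns a deferred object needing an explicit aggregation (here `.mean()`), that the dates follow the `%d-%b-%Y` layout, and that `weekly` is just an illustrative variable name.

```python
import io

import pandas as pd

csv = io.StringIO("""ID,NAME,DATE,VAR1
1,a,03-JAN-2013,69
1,a,04-JAN-2013,77
1,a,05-JAN-2013,75
2,b,03-JAN-2013,69
2,b,04-JAN-2013,75
2,b,05-JAN-2013,72""")

df = pd.read_csv(csv)
# Assumption: dates are day, abbreviated month name, four-digit year.
df["DATE"] = pd.to_datetime(df["DATE"], format="%d-%b-%Y")
df = df.set_index(["DATE", "ID"])
df.columns.name = "Params"

weekly = (
    df[["VAR1"]]          # keep only the numeric column, leaving 'NAME' out
    .unstack("ID")        # move 'ID' into the columns so the index is dates only
    .resample("W-THU")    # weekly bins ending on Thursday
    .mean()               # explicit aggregation (assumed for newer pandas)
    .stack("ID")          # move 'ID' back into the row index
)

print(weekly)
```

Depending on the pandas version, the `stack` call may emit a deprecation warning about its newer implementation; for this data the result should be the same either way.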
user1827356
  • Thanks! I noticed the dropping, but it never caused any problems in other situations, so I didn't think that would be the problem. – Rutger Kassies Mar 26 '13 at 07:57