4

I have a dataframe with dates as columns. I'd like to average the values from daily to a monthly level. I've tried with Time Grouper and Resample, but it doesn't like the columns names are strings and I can seem to figure out how to make columns into something like a DatetimeIndex.

My starting dataframe:

import pandas as pd

df = pd.DataFrame(data=[[1,2,3,4],[5,6,7,8]],
                  columns=['2013-01-01', '2013-01-02', '2013-02-03', '2013-02-04'], 
                  index=['A', 'B'])

Desired Output:

   2013-01-01  2013-02-01
A         1.5         3.5
B         5.6         7.5
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
user2242044
  • 8,803
  • 25
  • 97
  • 164

5 Answers5

8

You can using resample

df.columns = pd.to_datetime(df.columns)
df.T.resample('M').mean().T
Out[409]: 
   2013-01-31  2013-02-28
A         1.5         3.5
B         5.5         7.5

Or groupby one

axis=1 
df.groupby(pd.to_datetime(df.columns).to_period('M'),1).mean()
Out[412]: 
   2013-01  2013-02
A      1.5      3.5
B      5.5      7.5
BENY
  • 317,841
  • 20
  • 164
  • 234
4

First, convert column index to datetime with pd.to_datetime, then use T and groupby with pd.Grouper (note pd.TimeGerouper is deprecated use pd.Grouper):

df.columns = pd.to_datetime(df.columns)
df.T.groupby(by=pd.Grouper(freq='MS')).mean().T

Output:

   2013-01-01  2013-02-01
A         1.5         3.5
B         5.5         7.5
Scott Boston
  • 147,308
  • 15
  • 139
  • 187
4

you can make use of pd.PeriodIndex:

In [145]: df.groupby(pd.PeriodIndex(df.columns, freq='M'), axis=1).mean()
Out[145]:
   2013-01  2013-02
A      1.5      3.5
B      5.5      7.5
MaxU - stand with Ukraine
  • 205,989
  • 36
  • 386
  • 419
1

Try converting the column names to date first:

df = pd.DataFrame(data=[[1,2,3,4],[5,6,7,8]], columns=pd.to_datetime(['2013-01-01', '2013-01-02', '2013-02-03', '2013-02-04']), index=['A', 'B'])

Hope it helps!

1
import pandas as pd

list=df.columns
df_new = pd.DataFrame()

for i in range(int(0.5*len(list))):
    df_new[list[2*i]] = (df[[list[2*i], list[2*i+1]]].mean(axis=1))

Output

       2013-01-01  2013-02-03
A         1.5         3.5
B         5.5         7.5

I dont understand your Desired Output:

Manuel
  • 698
  • 4
  • 8
  • it's the monthly mean. – stucash Dec 07 '17 at 23:10
  • Yes, for '2013-01-01' the average is 1.5 ((1+2)/2) and 5.5 ((5+6)/2), but for 2013-02-01 the average is 2.5 ((2+3)/2) and 6.5 ((6+7)/2), and you use 3.5 ((3+4)/2) and 7.5 ((7+8)/2). – Manuel Dec 07 '17 at 23:15
  • it's not daily. it's monthly. have a look below answers, the index is on 'Month'. i.e. Jan and Feb. January, column A has an avarege of (1+2)/2 = 1.5. February, column A has an average of (3+4)/2 = 3.5. you get the rest I believe. – stucash Dec 07 '17 at 23:24