I want to use pd.DataFrame.sum with apply. However, the keyword arguments I pass seem to have no effect.

I started from the question "python pandas: apply a function with arguments to a series" to understand how to pass parameters to a function through apply. I tried the answer that seemed most suitable (the third one), and the arguments still seem to have no effect.
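For reference, here is a minimal sketch of how extra arguments normally reach a function through `Series.apply`. The function `scale` and its parameters `factor` and `offset` are hypothetical names made up for this illustration; they are not part of the linked question.

```python
import pandas as pd

def scale(x, factor, offset=0):
    # Hypothetical helper: called once per element of the Series.
    return x * factor + offset

s = pd.Series([1, 2, 3])

# Positional extras go in `args`; keyword extras are forwarded as-is.
result = s.apply(scale, args=(10,), offset=1)
print(result.tolist())
```

Here the extra arguments are forwarded on every call, so the question is why the same mechanism appears to do nothing for `skipna` in the groupby case.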

import pandas as pd

indexes = ['2017-09-01 01:15:00', '2017-09-01 01:30:00',
           '2017-09-01 01:54:00', '2017-09-01 01:59:00',
           '2017-09-01 02:15:00', '2017-09-01 02:30:00',
           '2017-09-01 02:54:00', '2017-09-01 02:59:00',
           '2017-09-01 05:15:00', '2017-09-01 05:30:00',
           '2017-09-01 05:54:00', '2017-09-01 05:59:00']
values_A = [1, 3, 4, 3, 5, 6, 3, 3, 9, 1, 9, 8]
values_B = [1, 3, 4, 3, 5, 6, 3, 3, 9, 2, 6, 3]

# Build the table and index it by timestamp so it can be grouped by hour.
table = pd.DataFrame({'datetime': indexes, 'A': values_A, 'B': values_B})
table['datetime'] = pd.to_datetime(table['datetime'])
table.set_index('datetime', inplace=True)
table.sort_index(inplace=True)

What I want, and what I obtain with

    table.groupby([pd.Grouper(freq='60Min', base=0)]).sum(skipna=True)

is:

datetime                A       B
2017-09-01 01:00:00     11.0    11.0
2017-09-01 02:00:00     17.0    17.0
2017-09-01 03:00:00     NaN     NaN
2017-09-01 04:00:00     NaN     NaN
2017-09-01 05:00:00     27.0    20.0

What I get with

    table.groupby([pd.Grouper(freq='60Min', base=0)]).apply(pd.Series.sum, skipna=True)

is:

datetime                A       B
2017-09-01 01:00:00     11.0    11.0
2017-09-01 02:00:00     17.0    17.0
2017-09-01 03:00:00     0.0     0.0
2017-09-01 04:00:00     0.0     0.0
2017-09-01 05:00:00     27.0    20.0
Albionion

1 Answer

This is not really a solution, but it is a way to work around the problem. If I do

table['hour'] = table.index.hour
table.groupby([pd.Grouper(freq='60Min', base=0), 'hour']).apply(pd.Series.sum, skipna = True)

The hours with no data are then dropped from the result. However, this does not explain the observed behavior.
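As a side note, and not as a confirmed explanation of the behavior above: in pandas 0.22 and later, the value reported for an empty bin is controlled by `min_count` rather than `skipna`. The sum of an empty selection is 0 by default, and `min_count=1` turns it into NaN. The sketch below assumes a recent pandas version and uses a small stand-in table (`df`, not the data from the question) with a gap at 02:00.

```python
import pandas as pd

# Stand-in data: values at 01:xx and 03:xx, nothing at 02:xx.
idx = pd.to_datetime(['2017-09-01 01:15:00',
                      '2017-09-01 01:30:00',
                      '2017-09-01 03:15:00'])
df = pd.DataFrame({'A': [1, 3, 9], 'B': [1, 3, 2]}, index=idx)

grouped = df.groupby(pd.Grouper(freq='60min'))

# Default: an empty hourly bin sums to 0.
default_sum = grouped.sum()

# min_count=1: a bin needs at least one valid value, otherwise it is NaN.
nan_sum = grouped.sum(min_count=1)

print(default_sum)
print(nan_sum)
```

If the goal is NaN for hours without data, passing `min_count=1` to the groupby `sum` may be simpler than adding an extra `hour` key, though whether it applies depends on the pandas version in use.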
