1

Date, Brand, Indication,Geo, Type and values are column names Currently calculating rolling quarter using a function and datetime stamp the code used below is as follows, it is taking some time to execute the code is there any way to change or modify the codeRQ column is the rolling quarter column added..

 import pandas as pd
 from pandas import ExcelWriter
 from pandas import ExcelFile
 import datetime

#***Date parsing using datetime.stptime function***
dateparse = lambda x: pd.datetime.strptime(x, '%m/%d/%Y')
df = pd.read_csv('Demo for MAt.csv', index_col=0,
             parse_dates=['Date'], date_parser=dateparse)
## importing data from csv file as dataframe
#Function to calculate the rolling sum for each record by date and other 
  levels

def RQ(x):
    ##Function checks whether the date is falling in the previous 3 months range 
    ##and sums up if it is in the range**
    RQS = df['Value'][
          (df.index >= x.name - datetime.timedelta(days = 62)) 
        & (df.index <= x.name) 
        & (df['Brand'] == x[0]) 
        & (df['Indication'] == x[1])
        & (df['Geo'] == x[2]) 
        & (df['Type'] == x[3])
    ]
    return RQS.sum()

##For each row the calculation is done using the apply function**
df['RQ'] = df.apply(RQ, axis=1)


#the below data frames has the input and expected output for a sample
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
inputdf = pd.DataFrame([['04/01/2016', 1,'A','National','Value',10], 
['05/01/2016', 1,'A','National','Value',20], ['06/01/2016', 
1,'A','National','Value',30]], columns=['Date', 
'Brand','Indication','Geo','Type','Value'])
print inputdf
outputdf = pd.DataFrame([['04/01/2016', 1,'A','National','Value',10,10], 
['05/01/2016', 1,'A','National','Value',20,30], ['06/01/2016', 
1,'A','National','Value',30,60]], columns=['Date', 
'Brand','Indication','Geo','Type','Value','RQ'])
print outputdf
#Input**Below input**
       Date  Brand Indication       Geo   Type  Value
0  04/01/2016      1          A  National  Value     10
1  05/01/2016      1          A  National  Value     20
2  06/01/2016      1          A  National  Value     30
## Expected output
       Date  Brand Indication       Geo   Type  Value  RQ
0  04/01/2016      1          A  National  Value     10  10
1  05/01/2016      1          A  National  Value     20  30
2  06/01/2016      1          A  National  Value     30  60
PSR_1993
  • 13
  • 5
  • add your sample data as text, see here for how to ask a good question: https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples – Haleemur Ali May 30 '18 at 11:12
  • Hi Ali, I have added Sample data frames for input and the expected outputs below the actual code, gave text as string input in the data frame and also attached the expected output format, thanks for your suggestion @HaleemurAli – PSR_1993 May 30 '18 at 11:52
  • 1
    Hi Ali, Thanks for this, but I am getting an ValueError : window must be an integer @HaleemurAli, I assume code is working with sample data but if there different levels of same date ( duplicate entries) then it is not able to take '62D' as integer – PSR_1993 May 30 '18 at 15:16
  • see updated answer, setting the Date column as index should resolve it. – Haleemur Ali May 30 '18 at 16:04

1 Answers1

0

convert the Date column to timestamp type, if not already done & set it as index

df.Date = pd.to_datetime(df.Date)
df = df.set_index('Date')

group the data using the other dimensions, and for each group, apply a rolling sum on value.

DataFrame.rolling can create temporal windows, and defaults to using the index for windowing. specify 62D for the window size as you have done in your attempt.

df['RQ'] = df.groupby(list(df.columns[:-1].values)).Value.apply(lambda x: x.rolling('62D').sum())

This outputs (with the sample data):

            Brand Indication       Geo   Type  Value    RQ
Date
2016-04-01      1          A  National  Value     10  10.0
2016-05-01      1          A  National  Value     20  30.0
2016-06-01      1          A  National  Value     30  60.0
Haleemur Ali
  • 26,718
  • 5
  • 61
  • 85