I have a script which generates dataframes containing 'blocks' of dates and stores them in a variable 'cur_blocks' such as this one:
year start end date
gmt_reportedtime
2019-03-11 2000 2000-01-09 2000-03-10 2019-03-11
2019-03-11 2001 2001-01-09 2001-03-10 2019-03-11
2019-03-11 2002 2002-01-09 2002-03-10 2019-03-11
2019-03-11 2003 2003-01-09 2003-03-10 2019-03-11
2019-03-11 2004 2004-01-09 2004-03-10 2019-03-11
I also have another dataframe 'col' indexed on datetime which has data in it that I would like to do downstream calculations such as ranking and z-scores on. I would like to slice multiple chunks of col using the blocks denoted in 'start' and 'end' of 'cur_blocks'. The date blocks in 'cur_blocks' are 60 day windows each separated by 1 year between 2000-2019.
Here is 'col':
X
2000-01-01 24
2000-01-02 90
2000-01-03 62
2000-01-04 10
...
2019-02-28 73
2019-03-01 96
[7000 rows x 1 columns]
So far I have solved it like this:
window_aggregate = pd.DataFrame()
for index, block in cur_blocks.iterrows():
dt_block = col.loc[block['start']:block['end']]
window_aggregate = pd.concat([window_aggregate, dt_block])
I can then use 'window_aggregate' to do downstream work. This works, however it is very slow because 'cur_blocks' is generated for every single date between 2000-01-01 and today.
I am surprised that there isn't a built in function in pandas to do this kind of thing? Unless I missed it? But another user on stack overflow has also posed a similar question which so far hasn't been answered here
I recently posted a question detailing the broader problem here, although the code I've written mostly solves the problem it is far too slow for practical purposes.
I've seen it solved for a single day over several years here but not with a window.
There may be a solution here but I don't understand how to apply the answer to my data
EDIT: showing desired output something like below with the dates in the dataframe corresponding to the start and end blocks in cur_blocksDesired output
X
2000-01-09 24
2000-01-10 90
...
2000-03-10 62
2001-01-09 10
2001-01-10 10
...
2001-03-10 10
2002-01-09 10
...
...
...
2004-03-10 73