I have daily data in a dataframe (ds), spanning several years, that looks like this:
For each day, I need all of the quotes normalized to the quote at a specific time of that particular day. For example, on June 1st I need all the quotes normalized to the quote at 3pm on June 1st, but on June 2nd I need them all normalized to the quote at 3pm on June 2nd.
I have a dataframe of all the 3pm quotes daily, as shown here:
So I thought it would work if I could simply divide my quotes dataframe (ds) by the 3pm dataframe:
dr = ds.groupby(ds.index.date).apply(lambda x: x / b)
Of note, the 3pm dataframe (b) covers more days than the quotes dataframe (ds). As a result, each day's quotes have to be divided by the correct corresponding day's row in the 3pm dataframe, probably by matching on something like b.index.date.
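For reference, here's a minimal self-contained sketch (with made-up sample data, and assuming the 3pm columns map to bid/ask/mid in that order) of the date-aligned division I'm after:

```python
import pandas as pd

# Toy intraday quotes dataframe mirroring ds's layout
idx = pd.to_datetime(
    ["2000-01-04 14:45:12", "2000-01-04 14:46:10", "2000-01-05 14:50:00"]
).tz_localize("UTC")
ds = pd.DataFrame(
    {"bid": [281.0, 281.0, 280.5],
     "ask": [281.5, 282.0, 281.0],
     "mid": [281.25, 281.5, 280.75]},
    index=idx,
)

# Toy 3pm dataframe mirroring b's layout (covers each day once)
bidx = pd.to_datetime(
    ["2000-01-04 15:00:00", "2000-01-05 15:00:00"]
).tz_localize("UTC")
b = pd.DataFrame(
    {"bid_close": [281.35, 280.73],
     "ask_close": [281.95, 281.48],
     "price": [281.65, 281.105]},
    index=bidx,
)

# Rename the 3pm columns so the elementwise division lines up by column
b_aligned = b.set_axis(["bid", "ask", "mid"], axis=1)

# Build a denominator frame: one 3pm row per quote, matched on calendar date
denom = b_aligned.set_axis(b_aligned.index.date).reindex(ds.index.date)
denom.index = ds.index  # restore the intraday timestamps so indexes align
dr = ds / denom
```

Note that the column names differ between the two frames (bid vs bid_close, etc.), so a plain `ds / b` would produce all-NaN columns even if the dates matched; the rename step is what makes the division align.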
I've also tried not using a separate dataframe, and instead pulling out the last available quote from the quote dataframe (ds) itself - this did not work either:
EDIT: Thanks for the advice DSM. Does this help at all? Not sure if I should use DataFrame.to_string() on the dataframe because that doesn't seem easily clipboard-readable either.
Quotes ds dataframe:
bid ask mid
2000-01-04 14:45:12+00:00 281.0 281.5 281.25
2000-01-04 14:46:10+00:00 281.0 282.0 281.5
2000-01-04 14:47:14+00:00 281.2 282.2 281.7
2000-01-04 14:47:22+00:00 281.25 281.85 281.55
2000-01-04 14:47:47+00:00 281.25 281.75 281.5
2000-01-04 14:48:09+00:00 281.4 281.9 281.65
2000-01-04 14:48:40+00:00 281.3 282.3 281.8
2000-01-04 14:49:40+00:00 281.3 281.8 281.55
2000-01-04 14:49:45+00:00 281.2 282.2 281.7
2000-01-04 14:50:53+00:00 281.4 281.9 281.65
3pm dataframe:
bid_close ask_close price
2000-01-04 15:00:00+00:00 281.35 281.95 281.65
2000-01-05 15:00:00+00:00 280.73 281.48 281.105
2000-01-06 15:00:00+00:00 279.7 280.3 280.0
2000-01-07 15:00:00+00:00 282.3 282.9 282.6
2000-01-10 15:00:00+00:00 281.7 282.3 282.0
2000-01-11 15:00:00+00:00 282.1 282.7 282.4
2000-01-12 15:00:00+00:00 281.9 282.5 282.2
2000-01-13 15:00:00+00:00 281.9 282.7 282.3
2000-01-14 15:00:00+00:00 283.15 283.75 283.45
2000-01-17 15:00:00+00:00 285.5 286.0 285.75
And the command:
c = ds.groupby(ds.index.date).apply(lambda x: x / x.between_time('14:30:00', '14:59:59').resample('30min').last().dropna(how='all'))
I've also tried (which seems close):
df = ds.groupby(ds.index.date).apply(lambda x: x / x.between_time('14:45:00', '14:59:59').tail(1))
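For completeness, here's a minimal sketch (with made-up sample data) of the "pull the reference quote out of ds itself" idea: group by calendar date and divide each day's rows by that day's last quote at or before 3pm. The `norm` helper name and the sample values are mine, not from my real data:

```python
import numpy as np
import pandas as pd

# Two toy days of intraday quotes at 5-minute spacing, ending at 15:00
rng = pd.date_range("2000-01-04 14:45", periods=4, freq="5min", tz="UTC").append(
    pd.date_range("2000-01-05 14:45", periods=4, freq="5min", tz="UTC")
)
ds = pd.DataFrame(
    {"bid": np.arange(8, dtype=float) + 280,
     "ask": np.arange(8, dtype=float) + 281},
    index=rng,
)

def norm(day):
    # Last quote at or before 15:00 of this day becomes the denominator
    ref = day.between_time("00:00", "15:00").iloc[-1]
    return day / ref

# group_keys=False keeps the original intraday index on the result
out = ds.groupby(ds.index.date, group_keys=False).apply(norm)
```

The key difference from my failing attempts is that the division happens inside the per-day group against a single row (a Series), so pandas broadcasts it across the day's rows instead of trying to align two differently-indexed frames.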
Here's a link that seems related (and what I based the command above off of): Grouping daily data by month in python/pandas and then normalizing