I have a dataframe like this:
                        close
    formated
    2017-01-03 09:30:00   29.9713
    2017-01-03 09:31:00   29.0622
    2017-01-03 09:32:00   29.0750
    2017-01-03 09:33:00   29.0276
    2017-01-03 09:34:00   29.0375
    ...                       ...
    2022-08-19 09:30:00  173.5500
    2022-08-19 09:31:00  173.4494
    2022-08-19 09:32:00  173.3400
    2022-08-19 09:33:00  173.3900
    2022-08-19 09:34:00  173.2600
The df contains a price value for every minute of the day starting from 9:30 through 16:00. I am looking to get the maximum possible profit for each minute of the day.
I am currently doing something like this:
    df['Profit'] = (df.groupby(pd.Grouper(freq='D'))['close'].transform('max') - df['close']).div(df['close'])
This gives me, for each row, the percentage profit relative to the highest value of the day.
However, this approach is flawed: it also calculates a profit for timestamps that come after the day's highest value has already been reached.
I don't want the max value for the whole day. For each row, I want the max value within the same day, taken only over the timestamps that come after that row.
Profit is defined as the difference between the maximum value across all following minutes in the same day and the value at that minute. If no later value in the day is higher, the profit is 0, as in the desired output.
Desired Output:
                           close  Profit abs.  Profit perc.
    formated
    2017-01-03 09:30:00  29.9713       0.0        0.0
    2017-01-03 09:31:00  29.0622       0.0128     0.0004404
    2017-01-03 09:32:00  29.0750       0.0        0.0
    2017-01-03 09:33:00  29.0276       0.0099     0.0003410
    2017-01-03 09:34:00  29.0375       0.0        0.0
    ...                      ...       ...        ...
Please, no solutions that iterate over the dataframe in a for loop, as this is extremely slow.
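One loop-free way to get this is a reversed cumulative max within each day. The sketch below is only an illustration under some assumptions: the index is a `DatetimeIndex` sorted in ascending order, the price column is named `close`, and the helper name `add_future_max_profit` is made up for this example. Because the running max includes the current row, the result is floored at 0, which matches the desired output.

```python
import pandas as pd


def add_future_max_profit(df: pd.DataFrame) -> pd.DataFrame:
    """Add per-row profit versus the highest later close of the same day.

    Assumes df has a DatetimeIndex sorted ascending and a 'close' column.
    """
    close = df["close"]
    # Walk the rows backwards: the running max of the reversed series is the
    # highest close from the current row to the end of that day.
    reversed_close = close.iloc[::-1]
    day = reversed_close.index.normalize()  # midnight of each row's date
    future_max = reversed_close.groupby(day).cummax().iloc[::-1]

    out = df.copy()
    # The running max includes the current row, so the difference is never
    # negative; rows with no higher later close get 0.
    out["Profit abs."] = future_max.to_numpy() - close.to_numpy()
    out["Profit perc."] = out["Profit abs."] / out["close"]
    return out
```

Calling `add_future_max_profit(df)` returns a copy of the frame with the two extra columns and leaves the original untouched. The only per-row work is a grouped `cummax`, so nothing iterates in Python.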