I am plotting stock OHLC data to mplfinance and customising the x-axis amd to do so I pass an array of indicies from my dataframe, this however is question to refactor my code and use pandas efficiently.
I loop through the data and every 15 minute aligned timestamp (i.e. on-the-hour or 15/30/45 minutes-past) I would want to print a label/tick.
Instead of using external arrays (tickLocations
and tickLabels
arrays in my code below), I would look to add an additional column from this data called XAxisTick
and set True
where a tick is required (please scroll right to show the added column):
Timestamp Volume VW Open Close High Low Time Num XAxisTick
2021-12-21 05:00:00-05:00 1370.0 22.5 22.5 22.5 22.5 22.5 1640080800000 6 True
2021-12-21 05:15:00-05:00 2790.0 22.3211 22.32 22.32 22.33 22.32 1640081700000 11 True
2021-12-21 05:30:00-05:00 3000.0 22.231 22.27 22.22 22.27 22.22 1640082600000 11 True
2021-12-21 05:35:00-05:00 202.0 22.1713 22.16 22.16 22.16 22.16 1640082900000 4 False
2021-12-21 06:00:00-05:00 125.0 22.17 22.17 22.17 22.17 22.17 1640084400000 1 True
2021-12-21 06:05:00-05:00 10000.0 22.2 22.2 22.2 22.2 22.2 1640084700000 17 False
2021-12-21 06:10:00-05:00 8969.0 22.1004 22.12 22.1 22.12 22.1 1640085000000 22 False
2021-12-21 06:15:00-05:00 4409.0 22.0664 22.1 22.06 22.1 22.06 1640085300000 7 True
2021-12-21 06:20:00-05:00 8300.0 22.0437 22.06 22.01 22.06 22.01 1640085600000 10 False
2021-12-21 06:25:00-05:00 13865.0 21.9568 22.06 21.91 22.06 21.91 1640085900000 31 False
2021-12-21 06:35:00-05:00 4000.0 21.9555 21.94 21.96 21.96 21.94 1640086500000 9 False
2021-12-21 06:40:00-05:00 2594.0 21.92 21.92 21.92 21.92 21.92 1640086800000 4 False
2021-12-21 06:45:00-05:00 751.0 21.9513 21.97 21.91 21.97 21.91 1640087100000 5 True
2021-12-21 06:50:00-05:00 306.0 21.9325 21.9 21.95 21.95 21.9 1640087400000 4 False
2021-12-21 06:55:00-05:00 300.0 21.9333 21.96 21.92 21.96 21.92 1640087700000 2 False
2021-12-21 07:00:00-05:00 13979.0 21.9129 21.96 21.93 21.96 21.9 1640088000000 60 True
Additionally I would like to have this adjustable for different time spans, i.e. data that's normally spaced every 15 minutes to have ticks every 60 minutes, or hourly data to have ticks every 4 hours (but keeping with the above, only if a data point exists there for the label). Please note i am not talking about resampling that dataframe will always be and plotted as-is, I am looking only to define where ticks are created via the XAxisTick
boolean depending on the tick generation strategy.
Once I get the above, I believe Pandas would then enable me to do more advanced filtering, like ignoring labels in plot image below at 05:00/05:15/05:30
squished on the plot due to minimal spacing, in that case would want at least 1 non-label row with XAxisTick=false
prior - thus dropping the label at 05:15
but leaving 05:00
and 05:30
.
This label filtering is what set my off in this jouney, however my pandas skills are lacking - thanks in advance for the time and assistance.
import mplfinance as mpf
import pandas as pd
import datetime
def getTimestampTickFrequency(df):
# get most common interval in minutes
mode = df.index.to_series().diff().astype('timedelta64[m]').astype('Int64').mode()[0]
if mode==5:
return 15 # for 5 minutes, tick every 15 mins
elif mode==15:
return 60 # for 15 minute data, tick every hour
elif mode==30:
return 120 # for 30 minute data, tick every 2 hours
return mode
def getTickLocationsAndLabels(df):
tickLocations = []
tickLabels = []
tickFrequencyMinutes = getTimestampTickFrequency(df)
entireTickRange = pd.date_range(start=df.index[0], end=df.index[-1], freq=f'{tickFrequencyMinutes}T')
# get indexes of data frame that match the ticks, if they exist
for tick in entireTickRange:
print(tick)
try:
found = df.index.get_loc(tick)
tickLocations.append(found)
tickLabels.append(tick.time().strftime('%H:%M'))
except KeyError:
pass # ignore
return tickLocations, tickLabels
df5trimmed = pd.read_csv('https://pastebin.com/raw/SgpargBb', index_col=0, parse_dates=True)
tickLocations, tickLabels = getTickLocationsAndLabels(df5trimmed)
fig, axlist = mpf.plot(df5trimmed,style='yahoo', figsize=(48,24), type='candlestick', volume=True, tight_layout=True, returnfig=True)
axlist[-2].xaxis.set_ticks(tickLocations)
axlist[-2].set_xticklabels(tickLabels)
# Display:
mpf.show()