I've got a lot of stock data from which I want to build an OHLC chart (Open, High, Low, Close) similar to this one:
For this I use the candlestick function from matplotlib.finance. I've got minute data, so in order to also make intraday charts I used this other Stack Overflow thread, which spaces all the candlesticks evenly to avoid gaps between the days (there is no data between 5:30 PM and 9:00 AM). This works reasonably well with a small amount of data, but unfortunately I've got A LOT of data (510 minutes per day * 10 years ≈ 1 million minutes). From this data I want to make a chart that gives an overview, but that can also be zoomed in so that I can see individual minutes for specific days in history.
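The even-spacing idea from that thread can be sketched on its own, as I understand it: replace each timestamp with its position in the series, and remember where each new day starts so the ticks can still show dates. The function name `evenly_spaced` and the sample timestamps are made up for illustration; the timestamps are assumed to be matplotlib date numbers (days as floats).

```python
import numpy as np

def evenly_spaced(timestamps):
    """Return positions 0..n-1 plus the position at which each day starts."""
    timestamps = np.asarray(timestamps, dtype=float)
    # One unit per bar, regardless of how much real time separates the bars
    positions = np.arange(timestamps.size)
    # Date numbers count days, so the integer part identifies the day;
    # return_index gives the first bar of every distinct day
    _, day_starts = np.unique(np.trunc(timestamps), return_index=True)
    return positions, day_starts

# Two hypothetical days with three one-minute bars each, starting at 9:00
minute = 1.0 / (24 * 60)
stamps = [733000 + 9 / 24 + i * minute for i in range(3)]
stamps += [733001 + 9 / 24 + i * minute for i in range(3)]

positions, day_starts = evenly_spaced(stamps)
print(positions, day_starts)
```

The overnight gap disappears because the x axis no longer carries any time information; the day-start positions are only used to place the tick labels.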
To do this I thought of simply making a huge (very wide) figure, saving it, and then zooming in with any image viewer.
I now have some code which I copied from the other thread and adjusted a bit. In the code I first create some random sample data (because the original sample data in the other thread was deleted). From this data I then draw a figure with a fixed candlestick width and a figure width equal to the number of minutes times the width of an individual candlestick (len(data)*candlestickWidth). The code and the resulting image are below, for only 2 days and 15 minutes per day (so a total of 30 minutes).
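The width calculation described above, isolated as a minimal sketch: the figure width in inches grows linearly with the number of bars, and the figure is rendered off-screen and written out as a PNG. The helper name `make_wide_figure` and the parameter `inches_per_bar` are hypothetical, and the x limits assume bars positioned at 0, 1, 2, ...

```python
import io

import matplotlib
matplotlib.use("Agg")  # render off-screen, no window needed
import matplotlib.pyplot as plt

def make_wide_figure(n_bars, inches_per_bar=0.2, height=5.0):
    """Create a figure whose width scales with the number of bars."""
    fig = plt.figure(figsize=(n_bars * inches_per_bar, height))
    ax = fig.add_axes([0.05, 0.1, 0.9, 0.85])
    # Bars are assumed to sit at integer positions 0..n_bars-1
    ax.set_xlim(-0.5, n_bars - 0.5)
    return fig, ax

fig, ax = make_wide_figure(30)  # 2 days * 15 minutes

# Save to an in-memory buffer here; a file name such as "chart.png" works too
buf = io.BytesIO()
fig.savefig(buf, format="png", dpi=100)
plt.close(fig)
```

One caveat, as far as I know: the Agg backend refuses images of 2^16 pixels or more in either direction, so at 100 dpi and 0.2 inches per bar a single image would hold only a few thousand minutes, not a million.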
My question is now: how do I position the candlesticks so that there are no spaces between the sticks, and so that the image width depends on the number of sticks, getting wider as I add more minutes?
All tips are welcome!
import numpy as np
import matplotlib.pyplot as plt
import datetime
import random
from matplotlib.finance import candlestick
from matplotlib.dates import num2date, date2num
# Create sample data for 2 days. Five columns: time, open, close, high, low
jaar = 2007
maand = 5  # a leading zero (05) is a syntax error in Python 3
data = np.array([[1.0,1.0,1.0,1.0,1.0]])
quotes = [(5, 6, 7, 4), (6, 9, 9, 6), (9, 8, 10, 8), (8, 6, 9, 5), (8, 11, 13, 7)]
for dag in range(5, 7):
    for uur in range(9, 10):
        for minuut in range(15):
            numdatumtijd = date2num(datetime.datetime(jaar, maand, dag, uur, minuut))
            koersdata = quotes[random.randint(0, 4)]
            data = np.append(data, [[numdatumtijd, koersdata[0], koersdata[1], koersdata[2], koersdata[3]]], axis=0)
data = np.delete(data, 0, 0)
print('Ready with building sample data')
# determine number of days and create a list of those days
ndays = np.unique(np.trunc(data[:,0]), return_index=True)
xdays = []
for n in np.arange(len(ndays[0])):
    xdays.append(datetime.date.isoformat(num2date(data[ndays[1],0][n])))
# creation of new data by replacing the time array with equally spaced values.
# this will allow to remove the gap between the days, when plotting the data
data2 = np.hstack([np.arange(data[:,0].size)[:, np.newaxis], data[:,1:]])
# plot the data
candlestickWidth = 0.2
figWidth = len(data) * candlestickWidth
fig = plt.figure(figsize=(figWidth, 5))
ax = fig.add_axes([0.05, 0.1, 0.9, 0.9])
# customization of the axis
ax.spines['right'].set_color('none')
ax.spines['top'].set_color('none')
ax.xaxis.set_ticks_position('bottom')
ax.yaxis.set_ticks_position('left')
ax.tick_params(axis='both', direction='out', width=2, length=8, labelsize=12, pad=8)
ax.spines['left'].set_linewidth(2)
ax.spines['bottom'].set_linewidth(2)
# set the ticks of the x axis only when starting a new day
ax.set_xticks(data2[ndays[1],0]) ## (Also write the code to set a tick for every whole hour)
ax.set_xticklabels(xdays, rotation=45, horizontalalignment='right')
ax.set_ylabel('Quotes', size=20)
# Set limits to the high and low of the data set
ax.set_ylim([min(data[:,4]), max(data[:,3])])
# Create the candle sticks
candlestick(ax, data2, width=candlestickWidth, colorup='g', colordown='r')
plt.show()