Although there are many matplotlib optimization posts around, none of them give the exact tips I need, for example:
Matplotlib slow with large data sets, how to enable decimation?
Matplotlib - Fast way to create many subplots?
My problem is that I have 40 cached CSV files of time-series data. I'd like to plot all of them in one figure as 40 vertically stacked subplots and save the result as a single rasterized image.
My code using matplotlib is as follows:
import numpy as np
import matplotlib.pyplot as plt

def _Draw(self):
    """Output a graph of subplots."""
    BigFont = 10
    # Prepare subplots.
    nFiles = len(self.inFiles)
    fig = plt.figure()
    plt.axis('off')
    for i, f in enumerate(self.inFiles):
        pltTitle = '{}:{}'.format(i, f)
        colorFile = self._GenerateOutpath(f, '_rgb.csv')
        # Each file is an N x 3 matrix of RGB rows.
        data = np.loadtxt(colorFile, delimiter=Separator)
        nRows = data.shape[0]
        ind = np.arange(nRows)
        vals = np.ones(nRows)
        # One subplot per input file, stacked vertically.
        ax = fig.add_subplot(nFiles, 1, i + 1)
        ax.set_title(pltTitle, fontsize=BigFont, loc='left')
        ax.axis('off')
        # One unit-height bar per row, colored by that row's RGB value.
        ax.bar(ind, vals, width=1.0, edgecolor='none', color=data)
    plt.savefig(self.args.outFile, dpi=300, bbox_inches='tight')
The script hangs for the whole night. Each data file is roughly a 10,000 x 3 to 30,000 x 3 matrix.
In my case, I don't think memory-mapping the files (e.g. numpy.memmap) would avoid the memory hog, because the subplots, not the data loaded in each iteration, seem to be the problem: every bar() call creates one rectangle patch per row, so 40 subplots times 10,000-30,000 rows means several hundred thousand to over a million individual artists in a single figure.
I have no idea where to start optimizing this workflow. I could, however, drop the subplots, generate one image per data file, and stitch the 40 images together afterwards (roughly as sketched below), but that is not ideal.
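For reference, the fallback I have in mind would look roughly like this; it is only a sketch, assuming comma-separated RGB files and using Pillow for the stitching (the helper names draw_one and stitch are made up for illustration):

import numpy as np
import matplotlib.pyplot as plt
from PIL import Image

def draw_one(csv_path, png_path):
    """Render one file as its own color-strip PNG."""
    data = np.loadtxt(csv_path, delimiter=',')
    fig, ax = plt.subplots(figsize=(10, 0.5))
    ax.axis('off')
    # Same drawing idea as above: one unit-height bar per RGB row.
    ax.bar(np.arange(data.shape[0]), np.ones(data.shape[0]),
           width=1.0, edgecolor='none', color=data)
    fig.savefig(png_path, dpi=300, bbox_inches='tight')
    plt.close(fig)  # release the figure so memory stays flat across 40 files

def stitch(png_paths, out_path):
    """Stack the per-file PNGs vertically into one image."""
    images = [Image.open(p) for p in png_paths]
    width = max(im.width for im in images)
    height = sum(im.height for im in images)
    canvas = Image.new('RGB', (width, height), 'white')
    y = 0
    for im in images:
        canvas.paste(im, (0, y))
        y += im.height
    canvas.save(out_path)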
Is there an easy way in matplotlib to do this?