I want to plot a fairly small IoT CSV dataset, about 2 GB in size. Its dimensions are (~20,000, ~18,000). Each column should become a subplot with its own y-axis. I use the following code to generate the picture:
import random

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data: 2000 timestamps x 2000 columns
times = pd.date_range('2012-10-01', periods=2000, freq='2min')
timeseries_array = np.array(times)
cols = random.sample(range(1, 2001), 2000)
values = []
for col in cols:
    values.append(random.sample(range(1, 2001), 2000))
time = pd.DataFrame(data=timeseries_array, columns=['date'])
graph = pd.DataFrame(data=values, columns=cols, index=timeseries_array)

name = 'plot.png'  # output file name (placeholder, not part of the original snippet)

# One subplot per column, stacked vertically
fig, axarr = plt.subplots(len(graph.columns), sharex=True, sharey=True,
                          constrained_layout=True, figsize=(50, 50))
fig.autofmt_xdate()
for i, ax in enumerate(axarr):
    ax.plot(time['date'], graph[graph.columns[i]].values)
    ax.set(ylabel=graph.columns[i])
    # Hide the right and top borders of each subplot
    ax.spines['right'].set_visible(False)
    ax.spines['top'].set_visible(False)
    myFmt = mdates.DateFormatter('%d.%m.%Y %H:%M')
    ax.xaxis.set_major_formatter(myFmt)
    ax.label_outer()  # keep tick labels only on the outer subplots
print('--save-fig--')
plt.savefig(name, dpi=500)
plt.close()
But this is incredibly slow: 100 subplots took about 1 minute, and 2000 took around 20 minutes. My machine has 10 cores and 35 GB of RAM. Do you have any hints for speeding up the process? Is it possible to use multithreading? As far as I can see, this only uses one core. Are there tricks to draw only the relevant things? Or is there an alternative way to draw this plot faster, all in one figure without subplots?
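To make the last idea concrete, here is a minimal sketch of what "all in one figure without subplots" could look like: every column is rescaled to the range [0, 1], shifted into its own horizontal band, and all lines are added to a single Axes as one `LineCollection`. The helper name `plot_stacked`, its `max_labels` parameter, the figure size and the line width are assumptions made up for this illustration, not part of the code above. Whether this is actually faster on the full dataset has not been measured, and the rescaling means each row no longer shows its original y values.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; the figure is only written to a file
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.collections import LineCollection


def plot_stacked(graph, target, dpi=100, max_labels=20):
    """Draw all columns of `graph` as offset lines in a single Axes.

    `graph` is a DataFrame with a DatetimeIndex; `target` is a file name or a
    writable binary file object. Hypothetical helper, not a library function.
    """
    x = mdates.date2num(graph.index.to_numpy())  # timestamps as floats
    data = graph.to_numpy(dtype=float)           # shape (n_samples, n_cols)
    n_samples, n_cols = data.shape

    # Rescale every column to [0, 1] so that it fits into a band of height 1.
    low = data.min(axis=0)
    span = data.max(axis=0) - low
    span[span == 0] = 1.0                        # constant columns: avoid dividing by zero
    bands = (data - low) / span + np.arange(n_cols)  # column j occupies [j, j + 1]

    # One (n_samples, 2) array of (x, y) points per column.
    segments = np.empty((n_cols, n_samples, 2))
    segments[:, :, 0] = x
    segments[:, :, 1] = bands.T

    fig, ax = plt.subplots(figsize=(20, 20))
    lines = LineCollection(segments, linewidths=0.5)
    ax.add_collection(lines)
    # Collections are not reliably autoscaled, so set the limits explicitly.
    ax.set_xlim(x.min(), x.max())
    ax.set_ylim(0, n_cols)

    ax.xaxis_date()
    ax.xaxis.set_major_formatter(mdates.DateFormatter('%d.%m.%Y %H:%M'))
    fig.autofmt_xdate()

    # Label only a subset of the bands; thousands of tick labels are slow and unreadable.
    step = max(1, n_cols // max_labels)
    ax.set_yticks(np.arange(0, n_cols, step) + 0.5)
    ax.set_yticklabels([str(c) for c in graph.columns[::step]])

    fig.savefig(target, dpi=dpi)
    plt.close(fig)
    return lines
```

With the variables from the code above, this would be called as `plot_stacked(graph, name)`. The trade-off is that the per-column y-axes are gone: the tick labels on the left only identify which band belongs to which column.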