You could plot a random subsample of the points. With x and y as NumPy arrays (convert lists with np.asarray first):
idx = np.random.choice(len(x), num_samples, replace=False)
plt.scatter(x[idx], y[idx])
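If the subsample needs to be reproducible, something along these lines may help. It is only a sketch: the helper name `subsample_pairs` and its `seed` parameter are made up for illustration, and it assumes x and y have the same length.

```python
import numpy as np

def subsample_pairs(x, y, num_samples, seed=None):
    """Return num_samples matching (x, y) pairs, drawn without replacement.

    Hypothetical helper for illustration; not part of NumPy or matplotlib.
    """
    x = np.asarray(x)
    y = np.asarray(y)
    if x.shape != y.shape:
        raise ValueError("x and y must have the same shape")
    rng = np.random.default_rng(seed)        # seeded generator => repeatable draw
    num_samples = min(num_samples, len(x))   # cannot draw more points than exist
    idx = rng.choice(len(x), size=num_samples, replace=False)
    return x[idx], y[idx]                    # same indices keep the pairs aligned
```

The result can be passed straight to plt.scatter, e.g. plt.scatter(*subsample_pairs(x, y, 10**4, seed=0)).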
However, the picture then depends on which points happen to be drawn. We can do better by making a heatmap of the point density, which uses every point. plt.hexbin
makes this particularly easy:
plt.hexbin(x, y)
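A slightly fuller sketch of the hexbin call, in case you want to tune it or read the bin counts back. The data, the gridsize value and the "Agg" backend line are assumptions made for this illustration; gridsize, mincnt, get_array and get_offsets are part of matplotlib's hexbin API.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up sample data for the illustration
rng = np.random.default_rng(0)
x = rng.normal(loc=10, scale=2, size=10_000)
y = rng.normal(loc=0, scale=1, size=10_000)

fig, ax = plt.subplots()
# gridsize sets the number of hexagons along x; mincnt=1 hides empty bins.
hb = ax.hexbin(x, y, gridsize=30, mincnt=1)
fig.colorbar(hb, ax=ax, label="points per bin")

counts = hb.get_array()      # one count per drawn hexagon
centers = hb.get_offsets()   # matching (x, y) hexagon centers
print(f"{len(counts)} non-empty bins, densest holds {int(counts.max())} points")
```

A larger gridsize gives finer bins at the cost of noisier counts, so it is worth trying a few values on your own data.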
Here is an example, comparing the two methods:
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors

np.random.seed(2015)  # fixed seed so the figure is reproducible
N = 10**5
val1 = np.random.normal(loc=10, scale=2, size=N)
val2 = np.random.normal(loc=0, scale=1, size=N)

fig, ax = plt.subplots(nrows=2, sharex=True, sharey=True)
cmap = plt.get_cmap('jet')
norm = mcolors.LogNorm()  # log color scale, so low-count bins remain distinguishable

# Top panel: scatter plot of a random subsample (drawn without replacement)
num_samples = 10**4
idx = np.random.choice(len(val1), num_samples, replace=False)
ax[0].scatter(val1[idx], val2[idx])
ax[0].set_title('subsample')

# Bottom panel: hexbin heatmap built from all N points
im = ax[1].hexbin(val1, val2, gridsize=50, cmap=cmap, norm=norm)
ax[1].set_title('hexbin heatmap')

plt.tight_layout()
fig.colorbar(im, ax=ax.ravel().tolist())  # one colorbar spanning both panels
plt.show()
