I'm creating a single scatter plot from about 500 files, each a few hundred megabytes. I want all the points from a given file to share one color, determined by a single value (a float) in that file's metadata.
I can't find a way of first setting the color range for the whole plot, and then setting the color of a given plt.scatter call to a value in that range. It seems that no matter what, matplotlib wants to choose a color for each point from an iterable of the same size as the data. This is not practical for my application, as a single array holding all of my data would take several gigabytes.
In pseudocode, what I'd like to do is along these lines:
for file in files:
    val = get_metadata(file)
    data = np.genfromtxt(file)
    color_range = [c_min, c_max]
    plt.scatter(data[:,0],
                data[:,1],
                color_range = color_range,
                c = val)
plt.show()
Does anyone know of a matplotlib way to do this? I really haven't been able to find it in the documentation.
It was suggested that this is a duplicate of this question, but it's slightly different. The solution offered there,
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm
x = np.arange(10)
ys = [i+x+(i*x)**2 for i in range(10)]
colors = cm.rainbow(np.linspace(0, 1, len(ys)))
for y, c in zip(ys, colors):
    plt.scatter(x, y, color=c)
iterates through the defined colors sequentially. I want to define my color map as above, but then select from within that color map with the metadata value, which is a continuous float within a certain range.
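To make the goal concrete, here is a minimal sketch of the kind of selection I mean. It assumes a `matplotlib.colors.Normalize` object can be combined with a colormap to turn one float into one RGBA color, so that no per-point color array is needed. The range `c_min`/`c_max`, the synthetic data, and the names `fake_files`, `norm`, and `sm` are placeholders for illustration, not part of my real pipeline.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import numpy as np
import matplotlib.pyplot as plt
from matplotlib import cm
from matplotlib.colors import Normalize

# Assumed global range of the metadata value (placeholder numbers).
c_min, c_max = 0.0, 10.0
norm = Normalize(vmin=c_min, vmax=c_max)  # maps [c_min, c_max] onto [0, 1]
cmap = plt.get_cmap("rainbow")

# Stand-in for the real files: (metadata value, small random x/y array).
rng = np.random.default_rng(0)
fake_files = [(val, rng.random((100, 2))) for val in (1.0, 4.5, 9.0)]

fig, ax = plt.subplots()
for val, data in fake_files:
    rgba = cmap(norm(val))  # one RGBA tuple for the whole file
    ax.scatter(data[:, 0], data[:, 1], color=rgba)

# A ScalarMappable carries the norm and colormap so a colorbar can be drawn
# even though no scatter call was given an array of values.
sm = cm.ScalarMappable(norm=norm, cmap=cmap)
sm.set_array([])  # only needed by older matplotlib releases
fig.colorbar(sm, ax=ax, label="metadata value")
```

In the real loop, each file would be loaded, plotted, and discarded in turn, so only one file's data is held as an array at a time.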