
I have 3-dimensional data (dtype=np.float64) that I want to show in a scatter plot, using two of the dimensions for position and one for color, i.e. plt.scatter(data[0], data[1], c=data[2]). This works for the whole dataset. However, I want to create an animation from the portions of the data that were read at different times. So, if 1000 data points were read during the first time period, I would plot the first 1000 points using c=data[2][0:1000], then do the same for the next time step, and so on.

My problem occurs when I try to define a global colormap and use the entire dataset to determine the colors: I get ValueError: 'c' argument has 10000 elements, which is inconsistent with 'x' and 'y' with size 1000. How do I define a global colormap for the entirety of data[2], then assign colors to each point in each scatter plot from that global colormap?

I don't care about the actual colors used (they can be red-blue, green-orange, whatever). My only requirement is that the color bar be smooth.

I have tried normalizing the data with temp = plt.Normalize() and using c=temp in my loop, but the same error message pops up.

Here's a minimal (non-)working example that throws the error:

import numpy as np
import matplotlib.pyplot as plt

totaldata = np.random.uniform(-100, 100, size=(3, 10000)) #some generated data
data = [totaldata[0][0:1000], totaldata[1][0:1000], totaldata[2][0:1000]]
plt.scatter(data[0], data[1], c=totaldata[2], cmap='viridis', s=1)

I want the colormap to be based on the entirety of the third axis of the variable totaldata, and apply it to scatterplots of the slices of the total data.

jared
requiemman
  • Please provide a [minimum working example](https://stackoverflow.com/help/minimal-reproducible-example) that we can run and test. You can use random number if the actual values don't matter, but it should show the error that you're encountering. – jared Jun 14 '23 at 19:19
  • Added a minimum working example and explained the error – requiemman Jun 14 '23 at 19:29
  • For `c`, you used `totaldata[2]`. Did you mean to do `data[2]`? With that change it works for me. – jared Jun 14 '23 at 19:36
  • I want to use totaldata[2]. That's my question - I want a consistent colormap (defined from the total data) applied to slices of the total data. – requiemman Jun 14 '23 at 19:48
  • Well, you cannot use `totaldata[2]` since that doesn't have the right shape, which you clearly see. If I'm understanding correctly, you want to be able to use `data[2]` but with the colormap that would have been dictated by the entire data range, is that correct? Do you know the minimum and maximum possible values you will have in `totaldata[2]`? – jared Jun 14 '23 at 19:50
  • Yes, I know the minimum and maximum values I will have in totaldata[2]. You understood my question: I want the colormap dictated by the total data range. Essentially, I am plotting data[0] and data[1] on a scatter plot, but the third 'dimension' of each point is determined by data[2], and the color assigned to the third 'dimension' is determined by totaldata[2]. – requiemman Jun 14 '23 at 19:53
  • I have a solution for you, but I can't post it because someone power tripped and decided there wasn't enough information. There is though. – K. Shores Jun 14 '23 at 19:56
  • If that is the case, then I think you can use one of the two options in this answer: https://stackoverflow.com/a/47699278/12131013. You will put your known data range for `vmin` and `vmax` or inside `plt.Normalize(vmin, vmax)` and you should get the desired colors based on the expected range of values. – jared Jun 14 '23 at 19:57
  • @K.Shores maybe post it on pastebin or similar? – requiemman Jun 14 '23 at 19:57
  • Find the min and max of your data, `data_min, data_max = np.min(data), np.max(data)`, create a norm that will map values of your data between 0 and 1. `import matplotlib as mpl`, `norm = mpl.colors.Normalize(vmin=data_min, vmax=data_max)`. Then map each subset of the third dimension for the `c` parameter with the norm. – K. Shores Jun 14 '23 at 19:58
  • @requiemman Maybe this will suit your needs. https://pastebin.com/MiYvc96F – K. Shores Jun 14 '23 at 20:02
  • @K.Shores yes, it did! The only thing I'm working on is that generating each plot is done with a loop (as the number of slices vary) so ``plt.subplots()`` isn't very useful to me. But this is the solution. – requiemman Jun 14 '23 at 20:12
  • @requiemman if they re-open the question, I can post something that makes an animation. – K. Shores Jun 14 '23 at 20:15
  • @K.Shores doesn't look like they're reopening it, and I'm running into another problem: for some reason, norm=norm(totaldata[2]) etc. (i.e. the norm argument inside scatter()) throws a 'norm must be an instance of Normalize, not a MaskedArray' error. I've tried removing NaNs but run into the same error. Is this an error caused by trying to animate it? – requiemman Jun 14 '23 at 21:08
  • `norm(totaldata[2])` creates a masked array. `norm=norm` may be what you want – K. Shores Jun 14 '23 at 21:33
  • @K.Shores that gives every point the same color. – requiemman Jun 14 '23 at 21:42
  • @K.Shores edit: fixed. Restarting the kernel solved the problem – requiemman Jun 14 '23 at 22:24
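
The approach discussed in the comments above can be sketched as follows. This is only an illustration, assuming the random data from the question; the names `rng`, `scaled`, and `chunk` are made up for the example. It builds one `Normalize` from the full color axis, shows that calling the norm returns scaled values (a masked array) rather than a norm, and passes the `Normalize` instance itself to `scatter` while `c` receives only the slice's values.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch also runs headless
import matplotlib.pyplot as plt
from matplotlib.colors import Normalize

rng = np.random.default_rng(0)
totaldata = rng.uniform(-100, 100, size=(3, 10000))  # generated data, as in the question

# Build the norm once, from the full range of the color axis.
norm = Normalize(vmin=totaldata[2].min(), vmax=totaldata[2].max())

# Calling the norm returns the data rescaled to [0, 1] as a masked array.
# That result is not a valid value for scatter's `norm` parameter.
scaled = norm(totaldata[2])

# Pass the Normalize instance itself; `c` gets the values of the slice only.
chunk = totaldata[:, :1000]
fig, ax = plt.subplots()
sc = ax.scatter(chunk[0], chunk[1], c=chunk[2], cmap="viridis", norm=norm, s=1)
fig.colorbar(sc, ax=ax)
```

If the data range is known in advance, passing `vmin=` and `vmax=` directly to `scatter` in place of `norm=` should give the same color limits.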

1 Answer



Because my initial answer (please see below) has not been recognized as a proper answer to OP's question (I swear it is... :-), I will reformulate it in more straightforward terms with respect to the question's details and minutiae.

import numpy as np
import matplotlib.pyplot as plt
np.random.seed(20230621)

extremes = (-100, 100)
norm = plt.Normalize(*extremes) # stands in for (min(totdata[2]), max(totdata[2]))
totdata = np.random.uniform(*extremes, size=(3, 10000))
data = totdata[:,:1000]

plt.scatter(data[0], data[1], c=data[2],
    cmap='viridis', norm=norm,
    s=20, alpha=0.8)    # larger dots, more transparent
plt.colorbar(alpha=0.8) # apply alpha also to the colorbar
plt.show()
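
Since the question is ultimately about animating successive slices, here is a minimal sketch of how the same fixed norm might be reused across frames. It is not part of the original answer. It assumes equal-sized slices (`chunk_size` is a made-up parameter) and uses `matplotlib.animation.FuncAnimation` together with the collection's `set_offsets` and `set_array` methods, so the colorbar and color mapping stay the same from frame to frame.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation

rng = np.random.default_rng(20230621)
totdata = rng.uniform(-100, 100, size=(3, 10000))
chunk_size = 1000                      # assumption: every time step has 1000 points
n_frames = totdata.shape[1] // chunk_size

# One norm for all frames, taken from the full color axis.
norm = plt.Normalize(totdata[2].min(), totdata[2].max())

fig, ax = plt.subplots()
first = totdata[:, :chunk_size]
sc = ax.scatter(first[0], first[1], c=first[2],
                cmap="viridis", norm=norm, s=20, alpha=0.8)
ax.set_xlim(-100, 100)                 # fix the axes so frames are comparable
ax.set_ylim(-100, 100)
fig.colorbar(sc, ax=ax)

def update(frame):
    """Replace positions and color values with those of the given slice."""
    chunk = totdata[:, frame * chunk_size:(frame + 1) * chunk_size]
    sc.set_offsets(chunk[:2].T)        # (N, 2) array of x, y positions
    sc.set_array(chunk[2])             # values mapped to colors through `norm`
    return (sc,)

anim = FuncAnimation(fig, update, frames=n_frames, interval=200)
# anim.save("slices.gif") would write the animation, if a suitable writer is installed
```

If the slices differ in length, the same `update` pattern should still apply, as long as the offsets and the color array passed in each frame have matching lengths.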

Initial Answer


Use an ad hoc norm.


import numpy as np
import matplotlib.pyplot as plt

np.random.seed(20230619)
mat = np.random.randint(0, 20, (40, 40))

norm = plt.Normalize(-10, 30)
fig, (ax0, ax1)  = plt.subplots(ncols=2, figsize=(9, 6), layout='constrained')
im0 = ax0.imshow(mat-10, norm=norm)
im1 = ax1.imshow(mat+10, norm=norm)
plt.colorbar(im1, ax=[ax0, ax1], location='bottom')

plt.show()
gboffi