I have an (x, y) scatter plot in which each point is associated with a color. Some points, however, do not have a valid color and are assigned NaN. I would like to include these points, but show them in a color that is not part of the colormap.

Here's the example code:

import numpy as np
import matplotlib.colors as mcol
import matplotlib.pyplot as plt

numPoints = 20
nanFrequency = 3
xVec = np.arange(numPoints, dtype=float)
yVec = xVec
colorVec = np.linspace(0,1,numPoints)
colorVec[range(0, numPoints, nanFrequency)] = np.nan

colormap = mcol.LinearSegmentedColormap.from_list("Blue-Red-Colormap", ["b", "r"])

plt.scatter(xVec, yVec, c=colorVec, cmap=colormap)

and the corresponding output: [image: scatter plot not showing points with invalid color]

Every third point is not shown due to its invalid color value. Based on my code, I would have expected these points to be shown in yellow. Why doesn't this work?

Note that there's a related post concerning imshow(), from which the above code is inspired. The solution presented there does not seem to work for me.

Many thanks in advance.

Unis

2 Answers

First, you need to tell the colormap which color to use for invalid values: colormap.set_bad("yellow").

Even with that set, the points were not drawn because of a long-standing bug in matplotlib (#4354), which has since been fixed (#12422).

From matplotlib 3.1 onwards, you can pass plotnonfinite=True to scatter to include points whose color value is NaN or infinite. They are then drawn in the colormap's "bad" color.

import numpy as np
import matplotlib.colors as mcol
import matplotlib.pyplot as plt

numPoints = 20
nanFrequency = 3
xVec = np.arange(numPoints, dtype=float)
yVec = xVec
colorVec = np.linspace(0,1,numPoints)
colorVec[range(0, numPoints, nanFrequency)] = np.nan

colormap = mcol.LinearSegmentedColormap.from_list("Blue-Red-Colormap", ["b", "r"])
colormap.set_bad("yellow")

plt.scatter(xVec, yVec, c=colorVec, cmap=colormap, plotnonfinite=True)

plt.show()
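To check which color the colormap assigns to invalid entries without drawing anything, you can call the colormap directly on a masked array. This is only a small sketch; the values array is made up for illustration and is not part of the original example.

```python
import numpy as np
import matplotlib.colors as mcol

colormap = mcol.LinearSegmentedColormap.from_list("Blue-Red-Colormap", ["b", "r"])
colormap.set_bad("yellow")

# Illustrative data: one valid value at each end of the range, one NaN in between.
values = np.array([0.0, np.nan, 1.0])

# Mask the NaN entries explicitly so the colormap treats them as "bad".
rgba = colormap(np.ma.masked_invalid(values))

print(rgba[1])                 # RGBA assigned to the NaN entry
print(mcol.to_rgba("yellow"))  # RGBA of the named color, for comparison
```

If the two printed rows agree, set_bad has taken effect and scatter with plotnonfinite=True should use the same color.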


ImportanceOfBeingErnest

Your NaN values are not plotted because matplotlib's scatter currently filters them out before passing them to the colormap.

To show the NaN entries, you can manually assign them a dummy value with a special meaning. For example, because your values lie in the range [0, 1], you could decide that any value > 1 gets a special color. For this you have to fix the range of the color axis and specify a color for entries outside this range (in this case, above the maximum).

Basically you will use:

cax = ax.scatter(...)
cax.cmap.set_over('y') # assigns yellow to any entry >1
cax.set_clim(0, 1)     # fixes the range of 'normal' colors to (0, 1)

For your example:

import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt

numPoints = 20
nanFrequency = 3
xVec = np.arange(numPoints, dtype=float)
yVec = xVec
colorVec = np.linspace(0,1,numPoints)
colorVec[range(0, numPoints, nanFrequency)] = np.nan

cmap = mpl.colors.LinearSegmentedColormap.from_list("Blue-Red-Colormap", ["b", "r"], numPoints)

# ---

fig, axes = plt.subplots(nrows=2, figsize=(8, 2*6))

# ---

ax = axes[0]

ax.scatter(xVec, yVec, c=colorVec, cmap=cmap)

ax.set_xlim([0, 20])
ax.set_ylim([0, 20])

# ---

ax = axes[1]

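# replace NaN by a dummy value above the color range
colorVec[np.isnan(colorVec)] = 2.0

cax = ax.scatter(xVec, yVec, c=colorVec, cmap=cmap)
cax.cmap.set_over('y') # assigns yellow to any entry >1
cax.set_clim(0, 1)     # fixes the range of 'normal' colors to (0, 1)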

ax.set_xlim([0, 20])
ax.set_ylim([0, 20])

# ---

plt.show()

This produces two subplots: the top one corresponds to what you had, and the bottom one uses the dummy value and shows it in yellow.
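If you would rather not overwrite colorVec or hard-code the dummy value, the replacement can be wrapped in a small helper. The following is a sketch under assumptions: fill_nan_above is a hypothetical name, not a matplotlib function, and the color limits are taken from the finite data rather than fixed to (0, 1). The last lines apply Normalize and the colormap by hand, which mirrors the data-to-color mapping a scatter plot performs, to confirm that the filled entries end up in the "over" color.

```python
import numpy as np
import matplotlib.colors as mcol


def fill_nan_above(values):
    """Hypothetical helper: return (filled, vmin, vmax).

    filled is a copy of values in which every NaN is replaced by a
    value strictly greater than the largest finite entry.
    """
    values = np.asarray(values, dtype=float)
    vmin, vmax = np.nanmin(values), np.nanmax(values)
    sentinel = vmax + 1.0  # any value strictly greater than vmax would do
    filled = np.where(np.isnan(values), sentinel, values)
    return filled, vmin, vmax


colorVec = np.linspace(0, 1, 20)
colorVec[::3] = np.nan  # every third entry is invalid

filled, vmin, vmax = fill_nan_above(colorVec)  # colorVec itself is unchanged

cmap = mcol.LinearSegmentedColormap.from_list("Blue-Red-Colormap", ["b", "r"])
cmap.set_over("y")  # yellow for anything above vmax

# Normalize to [vmin, vmax] without clipping, then look up the colors.
norm = mcol.Normalize(vmin=vmin, vmax=vmax)
rgba = cmap(norm(filled))

print(rgba[0])  # color of the first (formerly NaN) entry
```

With these values, ax.scatter(xVec, yVec, c=filled, cmap=cmap, vmin=vmin, vmax=vmax) should draw the former NaN points in yellow while leaving the original array untouched.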

Tom de Geus