I ran into a problem while adding a colorbar to a scatter plot to indicate which class each sample belongs to. The code works perfectly when the classes are [0,1,2], but when the classes are, for example, [4,5,6], the colorbar automatically picks color values from the end of the colormap and appears as a solid blue color. I'm missing something obvious, but I just can't figure out what it is.

Here is example code that reproduces the problem:

import matplotlib.pyplot as plt
import numpy as np
fig, ax = plt.subplots(1 , figsize=(6, 6))
plt.scatter(datapoints[:,0], datapoints[:,1], s=20, c=labels, cmap='jet', alpha=1.0)
plt.setp(ax, xticks=[], yticks=[])
cbar = plt.colorbar(boundaries=np.arange(len(classes)+1)-0.5)
cbar.set_ticks(np.arange(len(classes)))
cbar.set_ticklabels(classes)
plt.show()

The variables can be, for example:

datapoints = np.array([[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7]])
labels = np.array([4,5,6,4,5,6,4])
classes = np.array([4,5,6])

The correct result is obtained with:

labels = np.array([0,1,2,0,1,2,0])

In my case, I want it to work for the classes [4,5,6] as well.

T. Holmström

1 Answer

The boundaries need to be in data units. That is, if your classes are 4, 5, 6, you probably want to use boundaries of 3.5, 4.5, 5.5, 6.5.

import matplotlib.pyplot as plt
import numpy as np

datapoints = np.array([[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7]])
labels = np.array([4,5,6,4,5,6,4])
classes = np.array([4,5,6])


fig, ax = plt.subplots(1 , figsize=(6, 6))
sc = ax.scatter(datapoints[:,0], datapoints[:,1], s=20, c=labels, cmap='jet', alpha=1.0)
ax.set(xticks=[], yticks=[])
cbar = plt.colorbar(sc, ticks=classes, boundaries=np.arange(4,8)-0.5)

plt.show()

If you want the boundaries to be determined automatically from the classes, some assumption must be made. E.g., if all classes are consecutive integers,

boundaries=np.arange(classes.min(), classes.max()+2)-0.5
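
As a minimal sketch of that assumption, the expression can be wrapped in a small helper that refuses non-consecutive classes. The name `consecutive_boundaries` is hypothetical and not part of matplotlib or NumPy:

```python
import numpy as np

def consecutive_boundaries(classes):
    """Return bin edges centered on consecutive integer classes (hypothetical helper)."""
    classes = np.sort(np.asarray(classes))
    # The half-step offset only centers the bins if neighboring classes differ by exactly 1.
    if not np.all(np.diff(classes) == 1):
        raise ValueError("classes must be consecutive integers")
    return np.arange(classes.min(), classes.max() + 2) - 0.5

boundaries = consecutive_boundaries(np.array([4, 5, 6]))
print(boundaries)  # [3.5 4.5 5.5 6.5]
```

The result can be passed as `boundaries=` to `plt.colorbar`, together with `ticks=classes`.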

In general, an alternative would be to use a BoundaryNorm, as shown e.g. in "Create a discrete colorbar in matplotlib", "How to specify different color for a specific year value range in a single figure? (Python)", or "python colormap quantisation (matplotlib)".
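
A rough sketch of the BoundaryNorm route, not taken from the linked answers, might look like the following. It assumes classes [4,6,7] as an example of non-consecutive values, and places the bin edges halfway between neighboring classes, which is only one possible choice:

```python
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import BoundaryNorm

datapoints = np.array([[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7]])
labels = np.array([4,6,7,4,6,7,4])   # assumed example data
classes = np.array([4,6,7])          # sorted, but not consecutive

# Bin edges in data units: halfway between neighboring classes,
# plus an outer edge half a unit beyond the first and last class.
inner = (classes[:-1] + classes[1:]) / 2
edges = np.concatenate(([classes[0] - 0.5], inner, [classes[-1] + 0.5]))

cmap = plt.get_cmap('jet')
norm = BoundaryNorm(edges, cmap.N)   # one color per bin

fig, ax = plt.subplots(figsize=(6, 6))
sc = ax.scatter(datapoints[:,0], datapoints[:,1], s=20, c=labels, cmap=cmap, norm=norm)
ax.set(xticks=[], yticks=[])

# Put one tick in the middle of each bin and label it with the class value.
centers = (edges[:-1] + edges[1:]) / 2
cbar = fig.colorbar(sc, ax=ax, ticks=centers)
cbar.set_ticklabels(classes)
```

Call `plt.show()` afterwards to display the figure. Because the norm works on the edges rather than on an evenly spaced range, the classes do not need to be consecutive.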

ImportanceOfBeingErnest
  • What if the classes are [4,6,7]? Would one option be to replace the labels with low values and define new labels for the color information in the scatter plot? – T. Holmström Mar 17 '19 at 16:44
  • It actually worked by replacing the color information with np.unique(z, return_inverse=True)[1].tolist(). Thanks for the help! – T. Holmström Mar 17 '19 at 16:49
  • `np.unique(z, return_inverse=True)[1].tolist()` maps the classes to successive integer numbers. That is sufficiently different from what I'm proposing here, and more similar to the linked answers using a boundary norm, to deserve its own answer. There's no need to accept this one if in the end you use a different solution; but showing that solution to others (here `z` isn't even defined) in a new answer would make sense. – ImportanceOfBeingErnest Mar 17 '19 at 17:29
  • For me, this answer helped me figure out what this is all about, so I will accept it. – T. Holmström Mar 17 '19 at 17:31
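
The `np.unique` approach mentioned in the comments was never posted in full, so the following is only a guess at what it might have looked like. It assumes `z` in the comment stands for the `labels` array and uses [4,6,7] as example classes:

```python
import numpy as np
import matplotlib.pyplot as plt

datapoints = np.array([[1,1],[2,2],[3,3],[4,4],[5,5],[6,6],[7,7]])
labels = np.array([4,6,7,4,6,7,4])   # assumed example data

# Map each label to its index among the sorted unique classes.
classes, codes = np.unique(labels, return_inverse=True)
# classes -> [4 6 7], codes -> [0 1 2 0 1 2 0]

fig, ax = plt.subplots(figsize=(6, 6))
# Color by the codes 0..n-1, so the original boundaries from the question work again.
sc = ax.scatter(datapoints[:,0], datapoints[:,1], s=20, c=codes, cmap='jet')
ax.set(xticks=[], yticks=[])
cbar = fig.colorbar(sc, ax=ax, ticks=np.arange(len(classes)),
                    boundaries=np.arange(len(classes) + 1) - 0.5)
cbar.set_ticklabels(classes)          # show the real class values on the colorbar
```

Call `plt.show()` afterwards to display the figure. The colors are then driven by the positions 0, 1, 2 rather than by the class values themselves, and only the tick labels carry the original classes.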