Assign specific colors to values of an array when plotting it using imshow without creating a secondary array

Question

I have a 2D NumPy array containing binary values (0 and 1) +1 additional value (-999) representing "no data".
Here's a dummy example:

import numpy as np

arr = np.random.randint(2, size=(100,))
# this no-data value has to be large compared to the data range:
arr[np.random.randint(1, 100, 20)] = -999 
arr = arr.astype(np.float32)
arr.reshape((10,10))

>> arr
array([[   1,    1,    0,    1,    0,    0, -999,    1,    0,    0],
       [   0,    1,    0,    0,    1,    0,    1,    1,    1,    1],
       [   1,    1,    1,    1, -999,    0,    1, -999,    1, -999],
       [   0,    1, -999,    1,    0,    1,    1,    1,    0,    1],
       [-999, -999,    1,    1,    1,    0,    0,    0,    0,    1],
       [-999,    1,    1, -999,    0,    1,    1, -999, -999,    1],
       [   1,    1,    1,    0,    0,    1,    1, -999, -999,    0],
       [-999,    0,    1,    0, -999, -999,    1,    1,    1,    1],
       [   1,    1,    0,    1,    1,    0,    0,    0,    1,    0],
       [   1, -999, -999,    0,    1,    0,    1,    0,    1,    1]],
       dtype=float32)

When I plot this array using matplotlib.imshow(), as 0 and 1 pixels are very close together compared to the whole data range, they have almost the same color, which makes them totally indistinguishable:

import matplotlib.pyplot as plt

plt.imshow(arr)

in yellow, the pixels with values of 0 and 1, and in dark purple the -999 pixels.

I'd like to map these pixel values to the following:

color_list = {-999: np.array([189, 184, 181]), # gray tone
                 0: np.array([0, 255, 0]),     # green
                 1: np.array([255, 180, 180])} # light red

This seems to work:

data_3d = np.ndarray(shape=(arr.shape[0], arr.shape[1], 3), dtype=int)
for i in range(0, arr.shape[0]):
    for j in range(0, arr.shape[1]):
        data_3d[i][j] = color_map[arr[i][j]]

plt.imshow(data_3d)

The masked version is also working:

# Normalizing the color_dict is mandatory to avoid a ValueError:
# "RGBA values should be within 0-1 range"
color_dict = {-999: np.array([189, 184, 181])/.255, # gray tone
                 0: np.array([0, 255, 0])/.255,     # green
                 1: np.array([255, 180, 180])/.255} # light red

mask = np.ma.masked_where(arr == -999, arr)

# Here the indices in square brackets are the actual values of the pixels
# which serves as keys in the color_dict:
# see https://stackoverflow.com/questions/26610389/defining-a-binary-matplotlib-colormap
customcm = mpl.colors.ListedColormap( [ color_list[0], color_list[1] ] )
customcm.set_bad( color_list[-999] )

plt.imshow(mask, cmap=customcm)

But I wonder if one can totally avoid the double for loop and/or the creation of a new array (including a mask) here?
For example, dynamically, by only using matplotlib, because some of my arrays are quite large (up to ~2GB) and I would like to keep the memory consumption as low as possible by avoiding building another array of the same size with up to 3 channels...

Versioning:

>>> numpy.__version__
'1.19.5'

>>> mpl.__version__
'3.3.4'

@Alex you should post this comment as an answer – RandomGuy Jul 15 '21 at 09:14 — RandomGuy, Jul 15 '21 at 09:14

Alex · Answer 1 · 2021-07-15T09:38:10.077

Here's an example using ListedColormap and np.nan.

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap

arr = np.array(
    [
        [1, 1, 0, 1, 0, 0, -999, 1, 0, 0],
        [0, 1, 0, 0, 1, 0, 1, 1, 1, 1],
        [1, 1, 1, 1, -999, 0, 1, -999, 1, -999],
        [0, 1, -999, 1, 0, 1, 1, 1, 0, 1],
        [-999, -999, 1, 1, 1, 0, 0, 0, 0, 1],
        [-999, 1, 1, -999, 0, 1, 1, -999, -999, 1],
        [1, 1, 1, 0, 0, 1, 1, -999, -999, 0],
        [-999, 0, 1, 0, -999, -999, 1, 1, 1, 1],
        [1, 1, 0, 1, 1, 0, 0, 0, 1, 0],
        [1, -999, -999, 0, 1, 0, 1, 0, 1, 1],
    ],
    dtype="float",
)
arr[arr < 0] = np.nan

cmap = ListedColormap(["green", "orangered"])
cmap.set_bad("silver")
plt.imshow(arr, cmap=cmap)

Which produces:

Assign specific colors to values of an array when plotting it using imshow without creating a secondary array

1 Answers1