35

I am attempting to use matplotlib to plot some figures for a paper I am working on. I have two sets of data in 2D numpy arrays: An ascii hillshade raster which I can happily plot and tweak using:

import matplotlib.pyplot as pp
import numpy as np

hillshade = np.genfromtxt('hs.asc', delimiter=' ', skip_header=6)[:,:-1]

pp.imshow(hillshade, vmin=0, vmax=255)
pp.gray()
pp.show()

Which gives:

Hillshade

And a second ascii raster which delineates properties of a river flowing across the landscape. This data can be plotted in the same manner as above, however values in the array which do not correspond to the river network are assigned a no data value of -9999. The aim is to have the no data values set to be transparent so the river values overlie the hillshade.

This is the river data, ideally every pixel represented here as 0 would be completely transparent.

River data

Having done some research on this it seems I may be able to convert my data into an RGBA array and set the alpha values to only make the unwanted cells transparent. However, the values in the river array are floats and cannot be transformed (as the original values are the whole point of the figure) and I believe the imshow function can only take unsigned integers if using the RGBA format.

Is there any way around this limitation? I had hoped I could simply create a tuple with the pixel value and the alpha value and plot them like that, but this does not seem possible.

I have also had a play with PIL to attempt to create a PNG file of the river data with the no data value transparent, however this seems to automatically scale the pixel values to 0-255, thereby losing the values I need to preserve.

I would welcome any insight anyone has on this problem.

sgrieve
  • 632
  • 1
  • 8
  • 16

4 Answers4

58

Just mask your "river" array.

e.g.

rivers = np.ma.masked_where(rivers == 0, rivers)

As a quick example of overlaying two plots in this manner:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

# Generate some data...
gray_data = np.arange(10000).reshape(100, 100)

masked_data = np.random.random((100,100))
masked_data = np.ma.masked_where(masked_data < 0.9, masked_data)

# Overlay the two images
fig, ax = plt.subplots()
ax.imshow(gray_data, cmap=cm.gray)
ax.imshow(masked_data, cmap=cm.jet, interpolation='none')
plt.show()

enter image description here

Also, on a side note, imshow will happily accept floats for its RGBA format. It just expects everything to be in a range between 0 and 1.

Udi
  • 29,222
  • 9
  • 96
  • 129
Joe Kington
  • 275,208
  • 71
  • 604
  • 463
  • 1
    Brilliant! Thank you. I had hoped for an elegant solution like this, but was clearly barking up the wrong tree. – sgrieve Jun 18 '13 at 14:07
  • 2
    @MyCarta - It will work with as many arrays as you'd like, but your masking expressions may become complex. To combine boolean expressions with numpy arrays, you need to use `&` instead of `and`, `|` instead of `or`, `~` instead of `not`, etc. For example: `(a > 5) & (b > 5)` instead of `a > 5 and b > 5`. You'll also can't chain the operators in quite the same way, e.g. you need `(x > 2) & (x < 5)` instead of `2 > x > 5`. – Joe Kington Mar 11 '15 at 21:08
19

An alternate way to do this with out using masked arrays is to set how the color map deals with clipping values below the minimum of clim (shamelessly using Joe Kington's example):

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

# Generate some data...
gray_data = np.arange(10000).reshape(100, 100)

masked_data = np.random.random((100,100))

my_cmap = cm.jet
my_cmap.set_under('k', alpha=0)


# Overlay the two images
fig, ax = plt.subplots()
ax.imshow(gray_data, cmap=cm.gray)
im = ax.imshow(masked_data, cmap=my_cmap, 
          interpolation='none', 
          clim=[0.9, 1])
plt.show()

example

There as also a set_over for clipping off the top and a set_bad for setting how the color map handles 'bad' values in the data.

An advantage of doing it this way is you can change your threshold by just adjusting clim with im.set_clim([bot, top])

tacaswell
  • 84,579
  • 22
  • 210
  • 199
  • Great! This is a better solution for large arrays. I was having memory issues with @Joe Kington's solution – PPR Nov 17 '20 at 21:01
3

Another option is to set all cells which shall remain transparent to np.nan (not sure what's more efficient here, I guess tacaswell's answer based on clim will be the fastet). Example adapting Joe Kington's answer:

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.cm as cm

# Generate some data...
gray_data = np.arange(10000).reshape(100, 100)

masked_data = np.random.random((100,100))
masked_data[np.where(masked_data < 0.9)] = np.nan

# Overlay the two images
fig, ax = plt.subplots()
ax.imshow(gray_data, cmap=cm.gray)
ax.imshow(masked_data, cmap=cm.jet, interpolation='none')
plt.show()

enter image description here

Note that for arrays of dtype=bool you should not follow your IDE's advice to compare masked_data is True for the sake of PEP 8 (E712) but stick with masked_data == True for element-wise comparison, otherwise the masking will fail: enter image description here

F1iX
  • 823
  • 6
  • 12
0

Another strategy would be to export the river branch network. This is often an option in hydrologic flow routing tools. I'm not sure which tool you used here, but below is an example using the Python pysheds library, which requires a raster type file so using a tiff file instead of a 2D numpy array:

from pysheds.grid import Grid
import matplotlib.pyplot as plt
import numpy as np

# Instantiate grid from raster
grid = Grid.from_raster('test1.tif')
dem = grid.read_raster('test1.tif')

# Fill pits
pit_filled_dem = grid.fill_pits(dem)

# Fill depressions
flooded_dem = grid.fill_depressions(pit_filled_dem)

# Resolve flats and compute flow directions
inflated_dem = grid.resolve_flats(flooded_dem)
fdir = grid.flowdir(inflated_dem)

# Compute accumulation
acc = grid.accumulation(fdir)

# Extract river branch network for accumulations > 1000 units
branches = grid.extract_river_network(fdir, acc > 1000)

# Create fig and axes objects of matplotlib figure
fig, ax = plt.subplots()

# Set limits and aspect and tick formatting
plt.xlim(grid.bbox[0], grid.bbox[2])
plt.ylim(grid.bbox[1], grid.bbox[3])
ax.set_aspect('equal')
ax.ticklabel_format(style='sci', scilimits=(0,0), axis='both')

# Set the axes color to gray to demonstrate non-river pixels 
ax.set_facecolor('lightgray')

# Plot the river branch network using for loop
for branch in branches['features']:
    line = np.asarray(branch['geometry']['coordinates'])
    plt.plot(line[:, 0], line[:, 1], color='k')

River network visualization