
I need to do a 2d density-like plot. However I calculate the "densities" myself. So essentially I have an NxM array of values that I can only plot with plt.matshow (or imshow).

fig, ax = plt.subplots()
im = ax.matshow(value_array)
ax.set_xticklabels(x_edges - 2.5)
ax.set_yticklabels(y_edges - 0.25)

However, in this case, the axis values are the pixels in the plot, whereas I really want it to show some user-defined values. So I manually change the tick labels as above.

This still leaves a problem. matshow still thinks the tick labels are labelling the "pixels" in the image, so the tick labels are printed in the "middle" of each pixel square. However, like I said, what I'm really trying to plot is more like a density plot, so each "pixel square" represents a bin in x,y space. It would make a lot more sense to have the tick labels printed on the square edges, like the way it's done for histogram plots and frequency plots in general.

Should I keep using matshow for this or is there another function that does this? For example, can I use the plt.hist2d but manually set the "heights" without entering data as a bunch of samples? Otherwise, how do I make plt.matshow put the tick labels in the way I want them?

Marses
  • Is the `extent` option of `imshow` what you are looking for? – Tom de Geus Feb 25 '19 at 10:20
  • @TomdeGeus I don't think so. As far as I can understand, extent would only affect the range of the image (like xlim or ylim), though I may have misunderstood how it works. Basically, in the documentation on extent, https://matplotlib.org/tutorials/intermediate/imshow_extent.html, you see in the plots that the axis actually starts at -0.5 and goes to about 6.5, while the tick labels are in the centres of the pixels, at 0.0, 1.0, etc. So if I could offset the tick locations by 0.5, it would do what I want (or, better, shift the whole image by -0.5 pixels). – Marses Feb 25 '19 at 10:29

2 Answers


I'm not sure that I understand you correctly. What I understand is that you want to take a 2-d histogram of your data, and want to show the count/density of each bin using a color, while retaining the real coordinates of the bin-edges.

Indeed you can use a combination of numpy.histogram2d and matplotlib.pyplot.imshow.

Let me start with a warning. With imshow you display pixels, so you implicitly assume that the bins are uniformly sized along each axis. The bin width may differ from the bin height, but all bins must share the same width and the same height for the representation to be fair.
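If the bins might not be uniform, a quick check on the edges can decide between imshow and pcolormesh, which takes the bin edges directly. This is only a minimal sketch: the helper `edges_are_uniform`, the edge arrays and the `values` array are hypothetical names made up for illustration, and `values` is assumed to be indexed as `values[ix, iy]`.

```python
import numpy as np
import matplotlib.pyplot as plt


def edges_are_uniform(edges, rtol=1e-9):
    """Return True if consecutive bin edges are equally spaced."""
    widths = np.diff(np.asarray(edges, dtype=float))
    return bool(np.allclose(widths, widths[0], rtol=rtol))


# Hypothetical, non-uniform bin edges and precomputed bin values.
x_edges = np.array([0.0, 1.0, 2.0, 4.0, 8.0])       # 4 bins along x
y_edges = np.array([0.5, 1.0, 2.0, 3.5])            # 3 bins along y
values = np.arange(12, dtype=float).reshape(4, 3)   # assumed shape (nx, ny)

fig, ax = plt.subplots()
if edges_are_uniform(x_edges) and edges_are_uniform(y_edges):
    # Uniform bins: pixels are a fair representation.
    mesh = ax.imshow(values.T, origin='lower', aspect='auto',
                     extent=(x_edges[0], x_edges[-1], y_edges[0], y_edges[-1]))
else:
    # Non-uniform bins: pcolormesh draws each cell between its own edges.
    # It expects the colour array with shape (ny, nx), hence the transpose.
    mesh = ax.pcolormesh(x_edges, y_edges, values.T)
fig.colorbar(mesh)
```

With the edges above the second branch is taken, so each cell is drawn with its true width and height.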

To achieve what I think you want you'll have to use something like this:

import matplotlib.pyplot as plt
import numpy as np

N = 100000
x = np.random.randn(N)
y = np.random.weibull(2.,N)

P, xedges, yedges = np.histogram2d(x, y, bins=(np.linspace(-4,+5,10), np.linspace(0,4,21)), density=True)

fig, ax = plt.subplots()

cax = ax.imshow(P.T, extent=(xedges[0], xedges[-1], yedges[0], yedges[-1]),
  origin='lower', interpolation='nearest', clim=(0,.4), cmap='afmhot_r')

cbar = fig.colorbar(cax,aspect=10)

ax.set_aspect('auto')

ax.set_xlabel('x')
ax.set_ylabel('y')

plt.savefig('test.png')
plt.show()

Which plots:

[image: output of the code above]

The tricky part is that to get a natural output:

  • You have to override imshow's default of putting the origin at the top of the image. As indicated, you do this with the origin='lower' option.
  • You have to plot the transposed output of numpy.histogram2d, because imshow shows the matrix as-is (rows run along the vertical axis), while the output of numpy.histogram2d has shape (nx, ny): the values along the x-axis correspond to rows.
  • You might have to change the aspect ratio, see this answer.
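Since the question starts from values that are already computed rather than from samples, the same three points can be sketched without numpy.histogram2d. The edge arrays and the random `value_array` below are placeholders, and I am assuming the array is indexed as `value_array[ix, iy]`; if yours is already stored as (ny, nx), drop the transpose.

```python
import numpy as np
import matplotlib.pyplot as plt

# Hypothetical uniform bin edges: 8 bins along x, 6 bins along y.
x_edges = np.linspace(0.0, 40.0, 9)
y_edges = np.linspace(0.5, 3.5, 7)

# Placeholder for the precomputed "densities", assumed shape (nx, ny).
rng = np.random.default_rng(0)
value_array = rng.random((len(x_edges) - 1, len(y_edges) - 1))

fig, ax = plt.subplots()
im = ax.imshow(
    value_array.T,                # transpose so that x runs horizontally
    origin='lower',               # put the first y bin at the bottom
    aspect='auto',                # do not force square pixels
    interpolation='nearest',
    extent=(x_edges[0], x_edges[-1], y_edges[0], y_edges[-1]),
)

# The axes are now in data coordinates, so ticks can sit on the bin edges.
ax.set_xticks(x_edges)
ax.set_yticks(y_edges)
fig.colorbar(im)
```

Because extent maps the outer borders of the image to the first and last edges, no tick labels need to be rewritten by hand.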
Tom de Geus
  • Thanks for the answer, I think it's working, but I've still got a few issues (also I'm using matshow in case it's different). First of all, my x axis ranges from 0 to 40, my y axis from 0.5 to 3.5, so just like your picture, the plot is very compressed vertically. Is this expected, and should I just try to stretch it out using some figure params then? Secondly: I'm not sure what I'm doing wrong, but it's not putting the full range of values on the y axis (i.e. I manually set the tick labels, but it puts one on every second tick, and overall it's got half the range I'd expect). – Marses Feb 25 '19 at 21:57
  • @LimokPalantaemon With respect to the aspect ratio: I have updated my answer, including `set_aspect`. With respect to the ticks, it's difficult to say without seeing the actual code and the problem. You could consider opening a new question? – Tom de Geus Feb 26 '19 at 07:09
  • hey, I've been looking into setting the aspect, but it doesn't work with a simple `set_aspect('equal').` Moreover, I'm starting to be convinced that this was a bit more complicated a question than I first thought. Setting the extent, and your answer, helped, but I think there is another step to make after this. I'm thinking of keeping the idea of 'pixels' in the extent and not stretching the figure; instead I will try to relabel the ticks directly by position. I will post an answer below (will still keep yours as the answer). A combination of both might help others. – Marses Feb 26 '19 at 14:17
  • @LimokPalantaemon You should set it to **auto** not equal. Or you should manually specify the aspect ratio. – Tom de Geus Feb 26 '19 at 15:08
  • Ah I see, yes it works with `auto`. Sorry my bad. This is definitely the better way to do it. Thank you. – Marses Feb 26 '19 at 15:21

As an addition to @TomdeGeus's answer, here is something that could help. Since I needed to plot a figure where the y-axis stretches from 0.5 to 3.5 while the x-axis stretches from 0 to 40, the image was very compressed, and I thought I had to force the aspect ratio, which wasn't working. There was also something wrong with the ticks it was displaying.

However, now that the aspect ratio is fixed, I would definitely recommend following Tom de Geus's answer; it's the correct way to do this.

So I still plot the image on "pixel" coordinates, i.e. I choose the extent so that the x- and y-axes count the pixels, but starting at 0 rather than at -0.5 like the default behaviour of plt.imshow():

fig, ax = plt.subplots()

im = ax.matshow(value_grid, origin='lower', extent=(0, len(x_edges)-1, 0, len(y_edges)-1))

Here len(y_edges) - 1 is the number of pixels I want along the y-axis, and y_edges is a list containing the values of the bin boundaries I want to display on the y-axis, as before.

Then I manually replace the tick labels, but I also need to correctly map them to the right ticks.

ax.set_xticks(list(range(len(x_edges))))
ax.set_xticklabels(x_edges)
ax.set_yticks(list(range(len(y_edges))))
ax.set_yticklabels(y_edges)

This preserves the square nature of the pixels produced by imshow. However, you have to keep in mind that the underlying axis is still defined in terms of pixels, i.e. if I want to place a point at the coordinate (25.0, 2.0), it won't actually end up at that location in the plot below.

[image: resulting plot]
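If you do stay on index coordinates and still need to draw something at a data position, the position has to be converted first. One way is linear interpolation between the edges; this is just a sketch, `data_to_index_coords` is a name I made up, and the edges below are example values matching my axis ranges.

```python
import numpy as np


def data_to_index_coords(value, edges):
    """Map a data value onto the index-based axis, where edge k sits at k.

    Assumes `edges` is sorted in increasing order. Values outside the
    edge range are clamped to the first or last index by np.interp.
    """
    edges = np.asarray(edges, dtype=float)
    return float(np.interp(value, edges, np.arange(len(edges))))


# Example edges: one every 5 units in x, one every 0.5 units in y.
x_edges = np.linspace(0.0, 40.0, 9)
y_edges = np.linspace(0.5, 3.5, 7)

px = data_to_index_coords(25.0, x_edges)  # 25.0 is the 6th edge -> index 5
py = data_to_index_coords(2.0, y_edges)   # 2.0 is the 4th edge -> index 3

# On the index-based axes the point would then be drawn with
# ax.plot(px, py, 'o') instead of ax.plot(25.0, 2.0, 'o').
```

This extra conversion step is exactly what the extent-based approach avoids, which is another reason to prefer it.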

Marses
  • I don't think this is correct in general: you make strong assumptions on your `x_edges` and `y_edges` (or in fact you are plotting the index, not their coordinate as you indicated you wanted). – Tom de Geus Feb 26 '19 at 15:10
  • Sorry, not sure what you mean. I do concede this is definitely a weaker way, and it only achieves the specific thing I wanted. As you say, I'm plotting by index but then I'm just relabelling the tick labels to display what I intended. That's why I'm leaving your answer as the accepted one. – Marses Feb 26 '19 at 15:18
  • Now I see. What in my solution was still not working for you? – Tom de Geus Feb 26 '19 at 17:23
  • Now that the aspect ratio is fixed it's fine. Another thing was that I didn't notice that I was still relabelling my tick labels like I do in the question, so that's also why the tick labels seemed wrong. I would say now it's fine. – Marses Feb 26 '19 at 21:04