I have two data arrays for which I plot a histogram using pyplot:

data1 = numpyArray1
data2 = numpyArray2

They do not have the same size, so I use the option density=True to compare them properly. I'm also letting pyplot choose the bin edges automatically, since the data is float and I'd rather not (unless it's absolutely necessary) define the limits manually.

fig, ax = plt.subplots(....)
ax[...].hist([data1, data2], bins = 30, density = True, histtype='step')

Example:

[Image: plot of the two histograms]

Questions:

  1. Can I assume that the bins are exactly the same for both distributions?
  2. How can I see (or even better, get) the automatic bin limits that pyplot created? (This question assumes the bins are integer numbers, not valid for me)
  3. (optional) Can I get the intersection point of the two curves somehow? (This question assumes a Gaussian distribution, which doesn't solve my problem)
Daniel Möller
  • In your code above you are passing 30 bins. The default value of bins is 10 as per [this](https://matplotlib.org/tutorials/introductory/customizing.html#matplotlib-rcparams) link. – Sheldore Mar 11 '19 at 20:23

1 Answer

From the docs, hist will return

n : array or list of array. The values of the histogram bins.

bins : The edges of the bins.

patches : Silent list of individual patches used to create the histogram or list of such list if multiple input datasets.

So use:

freqs, bins, _ = ax[...].hist([data1, data2], bins = 30, density = True, histtype='step')

In answer to your questions:

Can I assume that the bins are exactly the same for both distributions?

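Yes. When both datasets are passed in a single hist call, one set of bin edges is computed over their combined range and used for both, which is why only one bins array is returned.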

How can I see (or even better, get) the automatic bin limits that pyplot created? (This question assumes the bins are integer numbers, not valid for me)

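Use the returned bins array (see the code above). With bins = 30 it holds the 31 bin edges as floats, so it works for non-integer data.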
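If you want the edges without drawing anything, here is a minimal NumPy-only sketch. It assumes that hist derives the shared edges from the combined range of all datasets, which `np.histogram_bin_edges` on the concatenated data reproduces; `data1` and `data2` are synthetic stand-ins for your arrays.

```python
import numpy as np

# Hypothetical stand-ins for the two arrays in the question (different sizes).
rng = np.random.default_rng(0)
data1 = rng.normal(0.0, 1.0, size=500)
data2 = rng.normal(1.5, 0.7, size=800)

# One set of 31 edges (30 bins) spanning the combined range of both datasets.
combined = np.concatenate([data1, data2])
edges = np.histogram_bin_edges(combined, bins=30)

# Reuse the same edges for both datasets so the bins are identical.
dens1, _ = np.histogram(data1, bins=edges, density=True)
dens2, _ = np.histogram(data2, bins=edges, density=True)

# With density=True, each histogram integrates to 1 over the bin widths.
widths = np.diff(edges)
print(len(edges), edges[0], edges[-1])
```

Passing the explicit `edges` array back in as `bins=edges` is also a way to guarantee identical bins if you ever plot the two datasets in separate hist calls.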

(optional) Can I get the intersection point of the two curves somehow? (This question assumes a Gaussian distribution, which doesn't solve my problem)

Get the frequencies and normalize them, then see where one crosses the other. Example (with freqs and bins defined as above):

import numpy as np

freqA, freqB = freqs
freqA = freqA / freqA.sum()   # normalize so each histogram sums to 1
freqB = freqB / freqB.sum()
ix = np.diff(np.sign(freqA - freqB)).nonzero()[0]   # check where sign changes (curves cross)
intersections = bins[ix + 1]   # edge shared by the two bins between which the sign changes
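A step histogram only changes value at bin edges, so the snippet above can locate a crossing no more precisely than one bin. If a finer estimate is useful, one option (an assumption on my part, not something hist provides) is to treat the densities as sampled at the bin centers and interpolate linearly between neighbouring centers. The helper name `histogram_crossings` is made up for this sketch.

```python
import numpy as np

def histogram_crossings(dens_a, dens_b, edges):
    """Estimate x positions where two histograms on shared edges cross.

    Densities are treated as values at the bin centers and joined by
    straight lines; this is an approximation, not an exact result.
    """
    edges = np.asarray(edges, dtype=float)
    centers = (edges[:-1] + edges[1:]) / 2
    diff = np.asarray(dens_a, dtype=float) - np.asarray(dens_b, dtype=float)

    # Keep only strict sign flips between adjacent bins; bins where the
    # difference is exactly zero (e.g. both empty) are ignored.
    sign = np.sign(diff)
    ix = np.nonzero(sign[:-1] * sign[1:] < 0)[0]

    # Fraction of the way from center ix to center ix + 1 where diff hits 0.
    t = diff[ix] / (diff[ix] - diff[ix + 1])
    return centers[ix] + t * (centers[ix + 1] - centers[ix])
```

Usage would be `histogram_crossings(freqs[0], freqs[1], bins)` with the values returned by hist. With few samples per bin the difference can flip sign several times from noise alone, so expect spurious crossings in the tails.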
Tarifazo
  • Thank you :) -- Do you know what patches are? – Daniel Möller Mar 11 '19 at 20:45
  • They are `matplotlib.patches` objects (https://matplotlib.org/api/patches_api.html); you can use them to access custom properties to, say, change the color of one of the bars. But most of the time you won't use them at all. – Tarifazo Mar 12 '19 at 13:03