
Is there a way to tell matplotlib to "normalize" a histogram such that its area equals a specified value (other than 1)?

The option "normed = 0" in

n, bins, patches = plt.hist(x, 50, normed=0, histtype='stepfilled')

just brings it back to a frequency distribution.

Edwin
Pawin

2 Answers


Just compute the histogram yourself with np.histogram, scale it to any value you'd like, then use bar to plot it.

On a side note, this will normalize things such that the area of all the bars is normed_value. The raw sum will not be normed_value (though it's easy to have that be the case, if you'd like).

E.g.

import numpy as np
import matplotlib.pyplot as plt

x = np.random.random(100)
normed_value = 2

hist, bins = np.histogram(x, bins=20, density=True)
widths = np.diff(bins)
hist *= normed_value

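# align='edge' anchors each bar at its left bin edge (the default is 'center' since matplotlib 2.0)
plt.bar(bins[:-1], hist, widths, align='edge')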
plt.show()


So, in this case, if we integrate over the bins (sum each bar's height multiplied by its width), we get 2.0 instead of 1.0; i.e. (hist * widths).sum() will yield 2.0.
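If you'd rather have the raw sum of the bar heights equal normed_value (the other case mentioned in the side note), a minimal sketch could look like the following. The variable names counts and heights are just for illustration, and the data is random as in the example above.

```python
import numpy as np

x = np.random.random(100)   # illustrative data
normed_value = 2

# Plain counts per bin (no density normalization)
counts, bins = np.histogram(x, bins=20)

# Rescale so the heights themselves, not the areas, add up to normed_value
heights = counts * normed_value / counts.sum()

print(heights.sum())  # close to normed_value, up to floating-point rounding
```

These heights can be passed to bar the same way as hist in the example above.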

Joe Kington

You can pass a weights argument to hist instead of using normed. For example, if your bins cover the interval [minval, maxval], you have n bins, and you want to normalize the area to A, then I think

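# Each sample contributes A / (x.size * bin_width), so the bar areas sum to A.
# np.full gives a float array even if x holds integers.
weights = np.full(x.shape, A * n / (maxval - minval) / x.size)
plt.hist(x, bins=n, range=(minval, maxval), weights=weights)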

should do the trick.

EDIT: The weights argument must be the same size as x, and its effect is to make each value in x contribute the corresponding value in weights towards the bin count, instead of 1.
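As a quick sanity check of this weighting, here is a sketch that uses np.histogram (which accepts the same bins, range and weights arguments) so the area can be verified without plotting. The values of A, n, minval and maxval and the sample data are made up for illustration, and all samples are assumed to lie inside the binned range.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.random(1000)          # illustrative data, all inside [0, 1)

A = 2.0                       # target area (assumed value)
n = 20                        # number of bins (assumed value)
minval, maxval = 0.0, 1.0     # binned range (assumed values)

# Same constant weight for every sample: A / (x.size * bin_width)
weights = np.full(x.shape, A * n / (maxval - minval) / x.size)

heights, edges = np.histogram(x, bins=n, range=(minval, maxval), weights=weights)

# Integrate: sum of height * width over all bins
area = (heights * np.diff(edges)).sum()
print(area)  # close to A, up to floating-point rounding
```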

I think the hist function could probably do with a greater ability to control normalization, though. For example, I think as it stands, values outside the binned range are ignored when normalizing, which isn't generally what you want.
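To illustrate that last point, here is a small sketch with np.histogram, which, as far as I know, is what plt.hist uses under the hood. The data and range are made up so that exactly one of the five values falls outside the bins.

```python
import numpy as np

x = np.array([0.1, 0.2, 0.3, 0.4, 5.0])   # 5.0 lies outside the binned range
minval, maxval, n = 0.0, 1.0, 4            # illustrative range and bin count

# density=True normalizes over the samples that actually landed in a bin
dens, edges = np.histogram(x, bins=n, range=(minval, maxval), density=True)
widths = np.diff(edges)
area_density = (dens * widths).sum()

# Dividing the raw counts by the full sample size keeps the outlier in the denominator
counts, _ = np.histogram(x, bins=n, range=(minval, maxval))
heights_all = counts / (x.size * widths)
area_all = (heights_all * widths).sum()

print(area_density)  # about 1.0: the outlier is ignored
print(area_all)      # about 0.8: four of the five samples are inside the range
```

The second variant is the same scaling as the weights approach above with A = 1, so the choice between the two comes down to whether out-of-range values should count toward the total.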

James