I have a Python histogram.

I want to normalise the peak of the histogram to 1 so that only the relative height of the bars is important.

I see some methods of doing this that involve changing the bin width, but I don't want to do this.

I also realise that I could just change the labels of the y axis, but I also have another plot overlaid, so the yticks must be the actual values.

Is there no way to access and change the histogram "count" in each bin?

Thank you.

user1551817

1 Answer

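I think what you're after is a normalized form of your histogram, where the y-axis is the density instead of the count. If you're using NumPy, pass density=True to np.histogram (older NumPy releases called this flag normed). Note that this scales the total area of the histogram to 1, not the peak.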
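To make the difference between area and peak normalisation concrete, here is a minimal sketch. The seed and variable names are illustrative assumptions; the data mirrors the example further down. It checks that density normalisation gives unit area and leaves the tallest bar well below 1.

```python
import numpy as np

# Illustrative data: same mean and spread as the example below, fixed seed
rng = np.random.default_rng(0)
x = 200 + 25 * rng.standard_normal(10000)

# density=True rescales the bars so that sum(height * bin_width) == 1
density, edges = np.histogram(x, bins=50, density=True)
area = np.sum(density * np.diff(edges))

print("area under the histogram:", area)
print("height of the tallest bar:", density.max())
```

So density normalisation alone will not put the peak at 1; for that the counts have to be divided by the largest count.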

If you want the peak of your histogram to be 1, divide the count in each bin by the maximum bin count. For example (building on the Stack Overflow Matplotlib example):

#!/usr/bin/env python
import matplotlib.pyplot as plt
import numpy as np

# Generate random data
mu, sigma = 200, 25
x = mu + sigma*np.random.randn(10000)

# Create the histogram and scale the counts so the tallest bin is 1
hist, bins = np.histogram(x, bins=50)
hist = hist.astype(float) / hist.max()

# Plot the resulting histogram
center = (bins[:-1]+bins[1:])/2
width = 0.7*(bins[1]-bins[0])
plt.bar(center, hist, align = 'center', width = width)
plt.show()
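If you would rather keep using the hist plotting call than switch to bar, one possible alternative is to pass per-sample weights. This is a sketch under assumptions, not the only way to do it: each sample gets a weight of 1/(largest count), so the tallest bin sums to 1. The output filename is hypothetical, and the Agg backend is selected only so the sketch runs without a display.

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # assumption: no display available; remove to use plt.show()
import matplotlib.pyplot as plt

# Illustrative data with a fixed seed
rng = np.random.default_rng(0)
x = 200 + 25 * rng.standard_normal(10000)

# First pass: raw counts, only to find the tallest bin
counts, edges = np.histogram(x, bins=50)

# Second pass: every sample contributes 1/max_count instead of 1,
# so the tallest bar ends up with height 1
weights = np.full(x.shape, 1.0 / counts.max())
fig, ax = plt.subplots()
heights, _, _ = ax.hist(x, bins=edges, weights=weights)

# Anything else drawn on ax shares this y axis with its true values
fig.savefig("peak_normalised_hist.png")  # hypothetical output filename
```

Reusing the bin edges from the first pass keeps the bin width unchanged, and the y ticks show the real bar heights, so another curve can be overlaid on the same axes without relabelling.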

mdml
  • But doesn't that normalise the area to 1 rather than the peak? – user1551817 Oct 02 '13 at 14:30
  • Yes, quite right. But it should let you compare the relative heights of the bars easily enough. If you do want the peak to be 1, just divide every bin by the value of the maximum bin. – mdml Oct 02 '13 at 14:39
  • Sorry if I'm being dumb, but that's my question.. I don't know how to divide the bins by a certain value. – user1551817 Oct 02 '13 at 14:42
  • I modified my answer to show how to divide the bins by the max bin value. – mdml Oct 02 '13 at 14:56
  • Thank you. The first line you have is max_val = max(bins), but 'bins' is not yet defined? – user1551817 Oct 02 '13 at 15:08
  • I modified the example to include generating the histogram. – mdml Oct 02 '13 at 15:09
  • Okay, thank you. I have done as you suggested and now have the correct values in each bin. Finally, how do I "apply" these new bin values back into the histogram? Thank you! – user1551817 Oct 02 '13 at 15:50
  • let us [continue this discussion in chat](http://chat.stackoverflow.com/rooms/38493/discussion-between-mtitan8-and-user1551817) – mdml Oct 02 '13 at 16:20
  • How do I send output to a file if I'm on a server? Keep getting "Invalid DISPLAY variable." ... ? – Kevin J. Rice Apr 27 '16 at 23:38