
Suppose we have a function (such as the pdf of a normal distribution) and we want to approximate it with a histogram whose bars lie under the curve. I want to specify the number of bins and draw the bars under the curve. How can this be done in Python? For example, a graph like the one below, but with all bars under the curve and the number of bins as a parameter.

enter image description here

Amin
  • What have you tried? SO is not a coding service. Please read the following documentation, then [edit], and rephrase the question. [Take the Tour](https://stackoverflow.com/tour), [How to ask a good question](https://stackoverflow.com/help/how-to-ask), & [On Topic](https://stackoverflow.com/help/on-topic). Always provide a [mre] with **code, data, errors, current & expected output, as [formatted text](https://stackoverflow.com/help/formatting)** & **you're expected to [try to solve the problem first](https://meta.stackoverflow.com/questions/261592) and show your effort**. – Trenton McKinney Dec 31 '21 at 19:35
  • Thank you JohanC. That is exactly what I am looking for. – Amin Jan 01 '22 at 06:08

1 Answer


You can use the pdf to decide the heights of the bars:

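from scipy.stats import norm
import numpy as np

N = 20  # number of bars
# norm.ppf returns both endpoints as an array; unpack it into start and stop
x = np.linspace(*norm.ppf([0.001, 0.999]), N)
y = norm.pdf(x)  # bar heights: the pdf evaluated at each bar center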

With these heights, the top center of each bar lies exactly on the pdf, so the bars cut through the curve. To make the bars only touch the curve from below, calculate the pdf at the point of each bar where the pdf is lowest, which is the edge farthest from the mean: x + width/2 for positive x. As the pdf is symmetric, abs can be used to create a single expression for both positive and negative x-values.
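The idea can be sketched as a small helper that takes the number of bins as a parameter. This is only a minimal sketch for the standard normal pdf; the function name `bars_under_normal_pdf` and the parameters `lo` and `hi` are made up for illustration and assume at least two bins.

```python
import numpy as np
from scipy.stats import norm


def bars_under_normal_pdf(n_bins, lo=0.001, hi=0.999):
    """Return (centers, width, heights) for n_bins bars under the standard normal pdf.

    lo and hi are the quantiles bounding the range of bar centers
    (hypothetical parameter names, not part of any library).
    """
    x0, x1 = norm.ppf([lo, hi])
    centers = np.linspace(x0, x1, n_bins)
    width = centers[1] - centers[0]
    # The pdf decreases as |x| grows, so its minimum over a bar is at the
    # edge farthest from zero: |center| + width/2.
    heights = norm.pdf(np.abs(centers) + width / 2)
    return centers, width, heights


centers, width, heights = bars_under_normal_pdf(20)
```

The returned arrays can be drawn with `ax.bar(centers, heights, width=width)`. This trick relies on the pdf being symmetric around zero and decreasing on both sides; a different function would need its own way of finding the minimum over each bar.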

Here is an animation created via the celluloid library.

import matplotlib.pyplot as plt
import numpy as np
from scipy.stats import norm
from celluloid import Camera

fig, ax = plt.subplots(figsize=(8, 2))
fig.subplots_adjust(bottom=0.15, left=0.1, right=0.97)
# hide the frame around the plot, keeping only the left axis
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.spines['bottom'].set_visible(False)
camera = Camera(fig)
# plot range: from the 0.1% to the 99.9% quantile
x0, x1 = norm.ppf([0.001, 0.999])
x_pdf = np.linspace(x0, x1, 1000)
y_pdf = norm.pdf(x_pdf)
for N in range(10, 80):  # one animation frame per number of bars
    ax.plot(x_pdf, y_pdf, 'r', lw=2)
    x_bar = np.linspace(x0, x1, N)  # bar centers
    width = x_bar[1] - x_bar[0]
    # evaluate the pdf at the bar edge farthest from 0, so the bar stays under the curve
    y_bar = norm.pdf(np.abs(x_bar) + width / 2)
    ax.bar(x_bar, y_bar, width=width, fc='DeepSkyBlue', ec='k')
    ax.margins(x=0)
    ax.set_ylabel('probability density')
    camera.snap()  # capture the current frame
animation = camera.animate(interval=600)  # 600 ms per frame
animation.save('gaussian_histogram.gif')
plt.show()

histogram along gaussian pdf

PS: Here is a list of related questions (collected by @TrentonMcKinney), where you can find additional explanation and ideas:

JohanC