86

I have a one-dimensional array. I can compute the mean and standard deviation of this sample and plot the normal distribution, but I have a problem:

I want to plot the data and the normal distribution in the same figure.

I don't know how to plot both the data and the normal distribution.

Any ideas about the Gaussian probability density function in scipy.stats?

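import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

s = np.std(array)  # array is the 1-D data sample
m = np.mean(array)
plt.plot(norm.pdf(array,m,s))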
Michael Baudin
Adel
  • What have you tried so far? Please post the code you have so far so we can answer specific questions about it. – Garth5689 Nov 15 '13 at 21:48

4 Answers

180

You can use matplotlib to plot the histogram and the PDF (as in the link in @MrE's answer). For fitting and for computing the PDF, you can use scipy.stats.norm, as follows.

import numpy as np
from scipy.stats import norm
import matplotlib.pyplot as plt


# Generate some data for this demonstration.
data = norm.rvs(10.0, 2.5, size=500)

# Fit a normal distribution to the data:
mu, std = norm.fit(data)

# Plot the histogram.
plt.hist(data, bins=25, density=True, alpha=0.6, color='g')

# Plot the PDF.
xmin, xmax = plt.xlim()
x = np.linspace(xmin, xmax, 100)
p = norm.pdf(x, mu, std)
plt.plot(x, p, 'k', linewidth=2)
title = "Fit results: mu = %.2f,  std = %.2f" % (mu, std)
plt.title(title)

plt.show()

Here's the plot generated by the script:

[Figure: histogram of the data with the fitted normal PDF overlaid]
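If you would rather keep raw counts on the y-axis instead of passing `density=True`, one option is to scale the fitted PDF by the sample size times the bin width. The sketch below is only an illustration of that idea; the helper name `scaled_normal_curve` and its parameters are hypothetical, not part of scipy or matplotlib.

```python
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm


def scaled_normal_curve(data, bins=25, num_points=200):
    """Return (x, y) for a fitted normal curve scaled to a count histogram.

    Hypothetical helper for illustration; assumes equal-width bins.
    """
    data = np.asarray(data, dtype=float)
    mu, std = norm.fit(data)
    # Same equal-width edges that plt.hist uses for an integer `bins`.
    edges = np.histogram_bin_edges(data, bins=bins)
    bin_width = edges[1] - edges[0]
    x = np.linspace(edges[0], edges[-1], num_points)
    # density * N * bin_width converts the PDF into expected counts per bin.
    y = norm.pdf(x, mu, std) * data.size * bin_width
    return x, y


if __name__ == "__main__":
    data = norm.rvs(10.0, 2.5, size=500, random_state=0)
    x, y = scaled_normal_curve(data, bins=25)
    plt.hist(data, bins=25, alpha=0.6, color='g')  # raw counts, no density=True
    plt.plot(x, y, 'k', linewidth=2)
    plt.show()
```

The scaling only makes sense when all bins have the same width, which is the case when `bins` is an integer.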

unutbu
Warren Weckesser
  • Is it possible to do this without the scipy module? – maro Jan 12 '18 at 12:51
  • @maro Yes. Fitting the normal distribution is pretty simple. You can replace `mu, std = norm.fit(data)` with `mu = np.mean(data); std = np.std(data)`. You'll have to implement your own version of the PDF of the normal distribution if you want to plot that curve in the figure. – Warren Weckesser Jan 12 '18 at 16:46
  • Is there a reason to use `x = np.linspace(xmin, xmax, 100)` instead of `x = np.sort(data)`? The latter seems more reasonable. – Anastasiya-Romanova 秀 Jul 18 '19 at 08:42
  • @Anastasiya-Romanova秀, the original data might be sparse in some areas and dense in others, making the plot of the PDF jagged. By using linspace, you can sample the PDF on a regularly spaced grid and make the samples as dense as you like, resulting in a nice smooth plot of the PDF. – Warren Weckesser Jul 18 '19 at 13:03
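For the scipy-free route discussed in the comments above, a minimal sketch might look like the following. The function name `normal_pdf` is an illustrative choice, not a NumPy API; it just evaluates the standard normal density formula, and the curve is sampled on a `linspace` grid so it stays smooth.

```python
import numpy as np
import matplotlib.pyplot as plt


def normal_pdf(x, mu, sigma):
    """Normal density evaluated at x (hypothetical helper, NumPy only)."""
    x = np.asarray(x, dtype=float)
    z = (x - mu) / sigma
    return np.exp(-0.5 * z**2) / (sigma * np.sqrt(2.0 * np.pi))


if __name__ == "__main__":
    # Generate some demonstration data without scipy.
    rng = np.random.default_rng(0)
    data = rng.normal(10.0, 2.5, size=500)

    # Sample mean and (population) standard deviation replace norm.fit.
    mu, std = np.mean(data), np.std(data)

    plt.hist(data, bins=25, density=True, alpha=0.6, color='g')

    # Evaluate the PDF on a regular grid rather than at the data points.
    xmin, xmax = plt.xlim()
    x = np.linspace(xmin, xmax, 100)
    plt.plot(x, normal_pdf(x, mu, std), 'k', linewidth=2)
    plt.show()
```

`np.std` defaults to `ddof=0`, which should match the maximum-likelihood estimate that `norm.fit` returns.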
13

A simple alternative is to use seaborn (<= 0.11.2):

import numpy as np
import seaborn as sns
from scipy.stats import norm

# Generate simulated data
n_samples = 100
rng = np.random.RandomState(0)
data = rng.standard_normal(n_samples)

# Fit Gaussian distribution and plot
sns.distplot(data, fit=norm, kde=False)
4

To see both the normal distribution and your actual data you should plot your data as a histogram, then draw the probability density function over this. See the example on https://docs.scipy.org/doc/numpy-1.15.0/reference/generated/numpy.random.normal.html for exactly how to do this.

Community
YXD
1

There is a much simpler way to do it using seaborn:

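import matplotlib.pyplot as plt
import seaborn as sns
from scipy.stats import norm

data = norm.rvs(5, 0.4, size=1000)  # you can use a pandas Series or a list if you want

# By default distplot draws a histogram with a KDE curve; pass fit=norm to overlay a fitted normal PDF.
sns.distplot(data)
plt.show()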

Output:

[Figure: histogram of the data with a density curve overlaid]

For more information, see seaborn.distplot.

LonelyDaoist