I would like to add a density curve to my histogram. I know a little about the pdf function, but I got confused, and other similar questions did not help.

import random
import matplotlib.pyplot as plt

N = 100
# random.randint includes both endpoints, so the values are 0..9
nums = [random.randint(0, 9) for _ in range(N)]

bars = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]  # bin edges
alpha, loc, beta = 5, 100, 22  # not used yet

# density=True is the current spelling of the old normed=True
plt.hist(nums, density=True, bins=bars)
plt.show()

I'm looking for something like this:

[image: example of the desired plot]

Trenton McKinney
aaa
  • You might be interested in seaborn's [`kdeplot`](https://stanford.edu/~mwaskom/software/seaborn/generated/seaborn.kdeplot.html) function. – jakevdp Oct 26 '15 at 05:00
  • See https://stackoverflow.com/questions/33203645/how-to-plot-a-histogram-using-matplotlib-in-python-with-a-list-of-data/33203848#33203848 – Sergey Bushmanov Oct 25 '21 at 18:38

3 Answers

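from scipy import stats
import numpy as np
import matplotlib.pyplot as plt

np.random.seed(41)

N = 100
x = np.random.randint(0, 9, N)  # integers 0..8 (the upper bound is exclusive)
bins = np.arange(10)            # bin edges 0, 1, ..., 9

kde = stats.gaussian_kde(x)     # fit a Gaussian kernel density estimate
xx = np.linspace(0, 9, 1000)    # points at which the density is evaluated

fig, ax = plt.subplots(figsize=(8,6))
# density=True normalises the bars so that their total area is 1
ax.hist(x, density=True, bins=bins, alpha=0.3)
ax.plot(xx, kde(xx))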
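
For intuition about what `kde(xx)` returns, here is a minimal numpy-only sketch of a 1-D Gaussian KDE. It assumes Scott's rule for the bandwidth, which is scipy's default, so it should agree closely with `stats.gaussian_kde` for 1-D data. The helper name `gaussian_kde_1d` and the wider evaluation grid are my own choices for illustration, not part of scipy.

```python
import numpy as np

def gaussian_kde_1d(samples, grid):
    """Evaluate a 1-D Gaussian KDE of `samples` at the points in `grid`.

    Illustrative helper (not a scipy function); bandwidth follows Scott's rule.
    """
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    # Scott's rule in one dimension: h = std * n**(-1/5)
    h = samples.std(ddof=1) * n ** (-1 / 5)
    # Standardised distance from every grid point to every sample
    z = (grid[:, None] - samples[None, :]) / h
    # Sum one Gaussian bump per sample, scaled so the total area is 1
    return np.exp(-0.5 * z**2).sum(axis=1) / (n * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(41)
x = rng.integers(0, 9, 100)          # integers 0..8, upper bound exclusive
xx = np.linspace(-5, 14, 2000)       # wide grid so the tails are included
density = gaussian_kde_1d(x, xx)

# Trapezoidal estimate of the area under the curve
area = float(np.sum((density[1:] + density[:-1]) / 2 * np.diff(xx)))
print(round(area, 3))
```

Each sample contributes one Gaussian bump, and the sum is scaled to unit area, which is why the curve lines up with a histogram drawn with `density=True`.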

cel
  • Shouldn't it be ax.plot(xx, kde) on the last line instead? – 00__00__00 Dec 01 '17 at 07:27
  • @ErroriSalvo, think of `kde` as a fitted function. In the last line we evaluate `kde` at all positions in the array `xx`. It's similar to plotting a quadratic function: with `f = lambda x: x**2` you would call `plot(x, f(x))`. – cel Dec 01 '17 at 08:43
  • In "plt, ax = plt.subplots(figsize=(8,6))", you might want to replace "plt" with "fig" on the LHS. – Solomon Vimal Oct 03 '19 at 00:04
  • @SolomonVimal, thanks for catching that. That was a typo! – cel Oct 03 '19 at 06:48
  • After version 3.1.1 or so, the `normed` keyword is no longer available; use `density` instead. [commit link](https://github.com/matplotlib/matplotlib/commit/c015c4aae7338b58f3aa677e57fb82e8755ecc6e#diff-9e2655db1b2a6796cb9bacd9400e3ac063f5c8de8f6c81537df2591629e6fb5aR56) – Mustafa Aydın Feb 04 '21 at 07:20

Here's a solution using seaborn 0.11.1 and pandas 1.1.5:

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

N = 100
nums = np.random.randint(0, 9, N)  # integers 0..8 (the upper bound is exclusive)
df = pd.DataFrame(nums, columns=["value"])

fig, ax1 = plt.subplots()
sns.kdeplot(data=df, x="value", ax=ax1)  # density curve on the left axis
ax1.set_xlim((df["value"].min(), df["value"].max()))
ax2 = ax1.twinx()  # second y-axis for the counts
sns.histplot(data=df, x="value", discrete=True, ax=ax2)

Note that I use numpy to generate the random values because I need actual values, not generators. Passing discrete=True to histplot ensures that the bars are centered on the integer values.
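
The second y-axis is needed because the two layers use different units: `histplot` draws raw counts by default, while `kdeplot` draws a density whose area is 1. As a rough numpy sketch of how the two scales relate (the variable names here are illustrative, not from the code above), a count becomes a density when divided by the number of samples times the bin width:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100
values = rng.integers(0, 9, N)       # integers 0..8, upper bound exclusive

# One unit-wide bin centred on each integer, similar to discrete=True
edges = np.arange(-0.5, 9.5, 1.0)
counts, _ = np.histogram(values, bins=edges)

# density = count / (number of samples * bin width)
widths = np.diff(edges)
density = counts / (N * widths)

# NumPy applies the same normalisation when density=True is passed
density_np, _ = np.histogram(values, bins=edges, density=True)

print(counts.sum(), float((density * widths).sum()))
```

If both layers should share a single axis instead, `histplot` also accepts `stat="density"`, which puts the bars on the same scale as the KDE curve.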

MERose

distplot from seaborn draws a histogram and a density curve together (it has been deprecated since seaborn 0.11 in favor of histplot and displot):

import seaborn as sns
sns.distplot(df)
Mario