This is a follow-up to this question: Pandas limit Series/DataFrame to range of values of one column

Next, I'd like to plot a histogram of the values in the "Age" column and then smooth the result (to reduce scatter). What's an elegant way to do this?

Alpha
  • Why don't you try it yourself? It's easy. Look up Matplotlib's hist function, or pandas's hist function. – Kartik Feb 24 '16 at 05:03
  • [Density plot](http://stackoverflow.com/questions/4150171/how-to-create-a-density-plot-in-matplotlib), [Plotting probability](http://stackoverflow.com/questions/15415455/plotting-probability-density-function-by-sample-with-matplotlib), or maybe you just want `df['Age'].plot(kind='density')` or `df['Age'].hist()`. I'm personally unclear on what you mean by "then smooth the result". – Jarad Feb 24 '16 at 05:53
  • It's easy to make a histogram, but I'm having some formatting issues when trying to smooth over it. It seems like the only way is to produce a new function from the histogram and smooth/interpolate over it using scipy's interpolate, but I thought there might be a more Pythonic way to do it, e.g., with a pandas-native function? – Alpha Feb 24 '16 at 21:21

1 Answer

You can use Seaborn's distplot function, which by default plots a histogram with an automatically determined bin size and overlays a kernel density estimate.

import seaborn as sns
import numpy as np
import pandas as pd

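# Some test data: 100 random integers in [0, 1000)
np.random.seed(33454)
df = pd.DataFrame({'nb': np.random.randint(0, 1000, 100)})
df.sort_values('nb', inplace=True)  # optional; sorting does not affect the plot

# Histogram (automatic bin size) with a kernel density estimate overlaid
ax = sns.distplot(df['nb'])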

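
For the pandas-native route asked about in the comments, something along these lines may be enough. It is only a sketch: the "Age" values are made up for illustration, the bin count and `bw_method` value are arbitrary choices, and `plot.kde` needs SciPy to be installed:

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless run; drop this line in a notebook

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical stand-in for the real "Age" column
rng = np.random.default_rng(0)
df = pd.DataFrame({'Age': rng.integers(18, 90, size=200)})

fig, ax = plt.subplots()

# Normalized histogram, so it is on the same scale as the density curve
df['Age'].plot.hist(bins=20, density=True, alpha=0.5, ax=ax)

# Smoothed curve: Gaussian kernel density estimate on the same axes;
# a smaller bw_method follows the data more closely, a larger one smooths more
df['Age'].plot.kde(bw_method=0.3, ax=ax)

ax.set_xlabel('Age')
```

The KDE is computed from the raw values rather than from the binned counts, so it avoids the interpolate-over-the-histogram step described in the comments.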

Romain