
I would like to get histogram values from a DataFrame:

%matplotlib inline
import pandas as pd
import numpy as np

df=pd.DataFrame(60*np.random.sample((100, 4)), pd.date_range('1/1/2014',periods=100,freq='D'), ['A','B','C','D'])

Using pd.cut(), I can do this for a single column, for example:

bins=np.linspace(0,60,5)
df.groupby(pd.cut(df.A,bins)).count()
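
As a self-contained sketch of the single-column case, the snippet below rebuilds the same kind of frame and counts how many values of column A fall into each interval produced by pd.cut(). The seed value and the names binned_a and counts_a are assumptions added only for illustration and reproducibility.

```python
import numpy as np
import pandas as pd

np.random.seed(0)  # assumption: arbitrary seed, only for reproducibility
df = pd.DataFrame(
    60 * np.random.sample((100, 4)),
    index=pd.date_range("2014-01-01", periods=100, freq="D"),
    columns=["A", "B", "C", "D"],
)

bins = np.linspace(0, 60, 5)  # edges 0, 15, 30, 45, 60

# pd.cut labels each value with the interval it falls into, e.g. (0, 15].
binned_a = pd.cut(df["A"], bins)

# Count per interval; sort_index keeps the bins in ascending order.
counts_a = binned_a.value_counts().sort_index()
print(counts_a)
```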

Is it possible to get the histogram counts for all columns in a single DataFrame? The desired output would look like this:

            A   B   C   D       
(0, 15]     21  10  1   2
(15, 30]    14  24  21  24
(30, 45]    10  0   22  30
(45, 60]    25  5   25  25
Michal

1 Answer


How about this technique: essentially a list comprehension and a pd.concat()?

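np.random.seed(1)  # only affects the data if called before df is created
bins = np.linspace(0, 60, 5)
# bin each column with pd.cut, count per bin, then join the counts side by side
df = pd.concat([df[x].groupby(pd.cut(df[x], bins)).count() for x in df.columns], axis=1)
df.index.names = [None]
print(df)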

which for me produces:

           A   B   C   D
(0, 15]   26  20  31  23
(15, 30]  23  23  20  18
(30, 45]  24  32  24  29
(45, 60]  27  25  25  30
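
An alternative sketch, not part of the approach above, is to let np.histogram do the counting for each column and collect the results in a new frame. The name hist is an assumption for illustration. Note one difference: np.histogram treats bins as left-closed, [a, b), with the last bin also including its right edge, whereas pd.cut defaults to right-closed intervals, (a, b]. For continuous random data like this the counts should agree, but values that sit exactly on a bin edge can land in different bins.

```python
import numpy as np
import pandas as pd

np.random.seed(1)  # seed before creating the data so the run is reproducible
df = pd.DataFrame(
    60 * np.random.sample((100, 4)),
    index=pd.date_range("2014-01-01", periods=100, freq="D"),
    columns=["A", "B", "C", "D"],
)

bins = np.linspace(0, 60, 5)  # edges 0, 15, 30, 45, 60

# np.histogram returns (counts, edges); keep only the counts for each column.
# The index is labeled left-closed to reflect how np.histogram bins values.
hist = pd.DataFrame(
    {col: np.histogram(df[col], bins=bins)[0] for col in df.columns},
    index=pd.IntervalIndex.from_breaks(bins, closed="left"),
)
print(hist)
```

Building a separate hist frame also leaves the original df intact, which helps if the raw data is needed again later.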
Dickster