I want to plot a histogram of discrete datetime values, i.e. the number of events per day over a certain range of days. Because the entries are discrete, I would like every bin to cover the same number of discrete values (here, exactly one day). Something along these lines:

## data is a pandas.DataFrame
## data['DTM'] contains datetimes

min_date = data.DTM.dt.date.min()
max_date = data.DTM.dt.date.max()

n_days = (max_date - min_date ).days

# each bin should contain one day
bins = np.arange(min_date, max_date + 1, 1)

data.DTM.dt.date.hist(range=(min_date, max_date), bins=bins)

This fails because min_date and max_date are datetime.date objects, while np.arange expects numeric arguments here (and max_date + 1 is not defined for a date). Is there a convenient function that does the same as range() or numpy.arange() with dates?

Or is there a more elegant way to solve this?

I tried pd.date_range(min_date, max_date + pd.Timedelta(1, unit='d')).date; however, passing this to hist() results in an error:

TypeError: '<' not supported between instances of 'float' and 'datetime.date'
Soerendip

1 Answer

I found something that does the trick, and it does not mess up the xticks! Thanks to Stev!

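import matplotlib.pyplot as plt
import pandas as pd

tmp = data[['DTM']].set_index('DTM')
tmp['Count'] = 1
# Keep the grouped result: one row per calendar day, Count = events on that day
daily = tmp.groupby(pd.Grouper(freq='d')).count()
plt.bar(daily.index, daily.Count)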
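
For anyone who wants to try the per-day counting idea without the original data, here is a minimal self-contained sketch. The sample timestamps and the helper name plot_daily_counts are made up for illustration; only the DTM column name comes from the question. It uses value_counts plus reindex rather than pd.Grouper, which should give the same one-bar-per-day result and also keeps days without any events as zero-height bars.

```python
import pandas as pd

# Hypothetical sample data standing in for data['DTM'] from the question.
data = pd.DataFrame({
    "DTM": pd.to_datetime([
        "2021-01-01 08:00", "2021-01-01 17:30",
        "2021-01-02 09:15",
        "2021-01-04 12:00", "2021-01-04 13:00", "2021-01-04 23:59",
    ])
})

# Truncate each timestamp to midnight while keeping the datetime64 dtype
# (unlike .dt.date, which yields plain Python date objects).
days = data["DTM"].dt.normalize()

# One entry per calendar day in the range, including days with no events.
all_days = pd.date_range(days.min(), days.max(), freq="D")
counts = days.value_counts().reindex(all_days, fill_value=0)


def plot_daily_counts(counts):
    """Draw one bar per day (hypothetical helper, not part of pandas)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.bar(counts.index, counts.values)
    ax.set_ylabel("events per day")
    fig.autofmt_xdate()  # rotate the date labels so they do not overlap
    return ax
```

Calling plot_daily_counts(counts) followed by plt.show() draws the chart. Staying with datetime64 values instead of datetime.date objects is a deliberate choice in this sketch, since mixing date objects with floats appears to be what triggers the TypeError quoted in the question.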
Soerendip