In Pandas, I have a DataFrame of observations (baby bottle feeding volumes) that are indexed by a datetime and grouped by date:

...
bottles = bottles.set_index('datetime')
bottles = bottles.groupby(bottles.index.date)

I want to use matplotlib to plot the cumulative values as they increase each day--that is, show the volume of feedings as it increases each day and resets at midnight:

ax = plt.gca()
ax.xaxis.set_major_locator(mdates.DayLocator())
ax.xaxis.set_minor_locator(mdates.HourLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d-%m-%Y'))
bottles['volume'].cumsum().plot(kind='bar', figsize=[16,8])
ax.xaxis.grid(True, which="major")
ax.xaxis.grid(False, which="minor")
ax.yaxis.grid(True)
plt.gcf().autofmt_xdate()
plt.show()

Which produces: [plot]

I'd like to only label dates on the x-axis once per day, and I'd also like to only draw a vertical grid line on date boundaries (every 24 hours). Any recommendations for how to fix the above code?

Drew Dara-Abrams
  • pandas assumes that bar plots are categorical plots, so you're always going to have a tick for every bar shown. My guess is that you're going to have to plot against the matplotlib object directly and not through pandas' interface. – Paul H May 31 '16 at 15:45
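A minimal sketch of what the comment suggests, under some assumptions: the helper name `plot_daily_cumsum` is hypothetical, the hourly data is made up, and the bar width of `1 / 24` relies on matplotlib measuring date-axis widths in days. Because `ax.bar` receives real timestamps, the x-axis stays a date axis and `DayLocator` can place one tick (and one grid line) per day.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_daily_cumsum(volume):
    """Bar-plot a per-day cumulative sum of a datetime-indexed Series."""
    # Group by calendar day so the running total restarts at midnight.
    cumulative = volume.groupby(volume.index.normalize()).cumsum()

    fig, ax = plt.subplots(figsize=(16, 8))
    # Plotting through matplotlib keeps a true date axis (not categories).
    # On a date axis the bar width is in days, so 1/24 is one hour.
    ax.bar(cumulative.index, cumulative.to_numpy(), width=1 / 24, align="edge")

    # One labelled tick and one vertical grid line per day.
    ax.xaxis.set_major_locator(mdates.DayLocator())
    ax.xaxis.set_major_formatter(mdates.DateFormatter("%d-%m-%Y"))
    ax.xaxis.grid(True, which="major")
    ax.yaxis.grid(True)
    fig.autofmt_xdate()
    return fig, ax, cumulative


# Made-up hourly volumes covering three days (assumption: one value per hour).
rng = np.random.default_rng(0)
index = pd.date_range("2017-01-01", periods=72, freq=pd.Timedelta(hours=1))
volume = pd.Series(rng.integers(0, 10, size=len(index)), index=index, name="volume")

fig, ax, cumulative = plot_daily_cumsum(volume)
```

Call `plt.show()` afterwards to display the figure. The same function should also work on irregularly spaced timestamps, although a fixed one-hour bar width may then make bars overlap or leave gaps.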

1 Answer

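Since you didn't provide any data, I generated some dummy data. In essence, you retrieve the ticks and tick labels on the x-axis and hide all of them except the one at the start of each day. Because a hidden tick also hides its grid line, the vertical grid lines end up on the day boundaries only.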

Note: this assumes exactly one bar per hour (24 bars per day, starting at midnight), so resample your dataframe to hours if necessary.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# generate dummy hourly data and df
dates = pd.date_range('2017-01-01', '2017-01-10', freq='h')
df = pd.DataFrame(np.random.randint(0, 10, size=len(dates)), index=dates)
# cumsum with daily reset (pd.Grouper replaces the removed pd.TimeGrouper)
ax = df.groupby(pd.Grouper(freq='D')).cumsum().plot(kind='bar', width=1, align='edge', figsize=[16,8])
ax.xaxis.grid(True, which="major")
#ax.set_axisbelow(True)

#set x-labels to certain date format
ticklabels = [i.strftime('%d-%m-%Y') for i in df.index]  # '%D' is not supported on every platform
ax.set_xticklabels(ticklabels)

#only show labels once per day (at the start of the day)
xticks = ax.xaxis.get_major_ticks()
n=24 # every 24 hours
for index, label in enumerate(ax.get_xaxis().get_ticklabels()):
    if index % n != 0:
        label.set_visible(False)  # hide labels
        xticks[index].set_visible(False)  # hide ticks where labels are hidden

ax.legend_.remove()
plt.show()

Result: [plot]
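For data that is not already hourly, the resampling mentioned in the note could look roughly like this. It is only a sketch: the `bottles` frame and its values are made up, and padding each day out to a full 24 hours is my assumption about what the every-24th-tick logic needs in order to line up with midnight.

```python
import pandas as pd

# Hypothetical irregular feeding log (made-up values).
bottles = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            [
                "2016-05-30 07:10",
                "2016-05-30 07:50",
                "2016-05-30 10:30",
                "2016-05-31 06:15",
                "2016-05-31 09:40",
            ]
        ),
        "volume": [90, 60, 120, 100, 110],
    }
).set_index("datetime")

one_hour = pd.Timedelta(hours=1)

# Sum the volumes within each hour; hours without a feeding become 0.
hourly = bottles["volume"].resample(one_hour).sum()

# Pad to whole days so every day has 24 rows and the first row is midnight.
full_hours = pd.date_range(
    hourly.index[0].normalize(),
    hourly.index[-1].normalize() + pd.Timedelta(hours=23),
    freq=one_hour,
)
hourly = hourly.reindex(full_hours, fill_value=0)

# Running total that restarts at the start of each calendar day.
daily_cumsum = hourly.groupby(hourly.index.normalize()).cumsum()
```

`daily_cumsum` then has one row per hour and can be plotted with `kind='bar'` in the same way as the dummy data above.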

Chris