Several years ago I had problems plotting a bar chart without overlapping the date labels. I received an answer that worked.

Today I encountered the same situation, but that old solution doesn't work in my real example (it still works in the toy example). I cannot figure out what differs between my real example and the toy.

The toy index is a datetime64[ns]:

DatetimeIndex(['2017-01-01', '2017-01-02', '2017-01-03', '2017-01-04',
               '2017-01-05', '2017-01-06', '2017-01-07', '2017-01-08',
               '2017-01-09', '2017-01-10', '2017-01-11', '2017-01-12',
               '2017-01-13', '2017-01-14', '2017-01-15', '2017-01-16',
               '2017-01-17', '2017-01-18', '2017-01-19', '2017-01-20',
               '2017-01-21', '2017-01-22', '2017-01-23', '2017-01-24',
               '2017-01-25', '2017-01-26', '2017-01-27', '2017-01-28',
               '2017-01-29', '2017-01-30', '2017-01-31', '2017-02-01',
               '2017-02-02', '2017-02-03', '2017-02-04', '2017-02-05',
               '2017-02-06', '2017-02-07', '2017-02-08', '2017-02-09',
               '2017-02-10', '2017-02-11'],
              dtype='datetime64[ns]', freq=None)

which looks like my real index:

DatetimeIndex(['2011-01-01', '2011-02-01', '2011-03-01', '2011-04-01',
               '2011-05-01', '2011-06-01', '2011-07-01', '2011-08-01',
               '2011-09-01', '2011-10-01',
               ...
               '2019-03-01', '2019-04-01', '2019-05-01', '2019-06-01',
               '2019-07-01', '2019-08-01', '2019-09-01', '2019-10-01',
               '2019-11-01', '2019-12-01'],
              dtype='datetime64[ns]', name='date', length=108, freq=None)

The data in the problematic DataFrame is real numbers:

enter image description here

My code to generate the plot is almost verbatim to the toy example:

residuals = pd.DataFrame(overfit.predict(X_train)-y_train)

#https://stackoverflow.com/questions/49231052/datetime-x-axis-matplotlib-labels-causing-uncontrolled-overlap
#plt.bar(pd.to_datetime(residuals.index), residuals['efs'].values) # this has the same result as below
plt.bar(residuals.index, residuals['efs'])
plt.gca().xaxis.set_major_locator(mdates.DayLocator((1,15)))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%d %b %Y"))
plt.gcf().autofmt_xdate()
plt.title("foo")

plt.show()

But the result looks terrible: the axis labels overlap and many of the data bars vanish:

enter image description here

This is the original answer toy code for comparison:

import pandas as pd
import numpy as np; np.random.seed(42)
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# pd.datetime was removed in pandas 2.0; date_range accepts a date string directly
datelist = pd.date_range("2017-01-01", periods=93).tolist()
df = pd.DataFrame(np.cumsum(np.random.randn(93)), 
                  columns=['error'], index=pd.to_datetime(datelist))

plt.bar(df.index, df["error"].values)
plt.gca().xaxis.set_major_locator(mdates.DayLocator((1,15)))
plt.gca().xaxis.set_major_formatter(mdates.DateFormatter("%d %b %Y"))
plt.gcf().autofmt_xdate()
plt.show()
user3556757
  • Looks pretty good on my end, seeing days 1 and 15's only. – Quang Hoang Dec 04 '20 at 05:07
  • @QuangHoang yeah, the toy works. I'm saying my working example, which to me seems identical to the toy, gives weird result (bars are wrong and the axis no good) – user3556757 Dec 04 '20 at 05:33
  • The only problem is that your real index already consists of day `1` of each month, and you have 108 of those. So you are looking at 216 ticks (for days 1 and 15). Of course they would overlap :-). – Quang Hoang Dec 04 '20 at 05:38
  • That's what the `plt.gcf().autofmt_xdate()` incantation from the original issue was meant to fix. – user3556757 Dec 04 '20 at 06:03
  • Once you add the proper frequency (`freq=1M`) to your toy set, it breaks too: `datelist = pd.date_range(pd.datetime(2017, 1, 1).strftime('%Y-%m-%d'), periods=108, freq="1M").tolist()` – Asmus Dec 04 '20 at 08:17
  • So the solution is to use proper tick locations and bar widths, e.g.: `plt.bar(df.index, df["error"].values, width=28)` and `plt.gca().xaxis.set_major_locator(mdates.MonthLocator((1,6)))` – Asmus Dec 04 '20 at 08:21
  • [Plotly](https://plotly.com/python/bar-charts/) handles dates very well and supports interactive features like zooming. – Jacob K Dec 04 '20 at 15:43
  • `autofmt_xdate()` will not override your locator or formatter which as @QuangHoang says specifies too many ticks. You just need to change the locator to not specify so many ticks (or try the default locator, which is pretty good). – Jody Klymak Dec 04 '20 at 18:14
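
The suggestions in the comments can be combined into a minimal sketch. It is not the asker's actual code: the `residuals` frame here is synthetic random data standing in for the real model residuals, and the bar width of 25 days and the January/July tick months are illustrative choices that may need tuning for other data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display

import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumption: synthetic stand-in for the real residuals,
# 108 month-start dates covering 2011-2019 like the index in the question.
rng = np.random.default_rng(42)
index = pd.date_range("2011-01-01", periods=108, freq="MS", name="date")
residuals = pd.DataFrame({"efs": rng.standard_normal(108)}, index=index)

fig, ax = plt.subplots(figsize=(10, 4))

# On a date axis the bar width is measured in days. The default (0.8)
# is under one day wide, which is why monthly bars can seem to vanish.
ax.bar(residuals.index, residuals["efs"].to_numpy(), width=25)

# Two ticks per year (1 January and 1 July) instead of two per month.
ax.xaxis.set_major_locator(mdates.MonthLocator(bymonth=(1, 7)))
ax.xaxis.set_major_formatter(mdates.DateFormatter("%b %Y"))

# autofmt_xdate only rotates and aligns the labels; it does not thin them out.
fig.autofmt_xdate()
ax.set_title("foo")

fig.canvas.draw()
tick_count = len(ax.get_xticks())
```

With a display available, dropping the `matplotlib.use("Agg")` line and calling `plt.show()` at the end should render the chart. Removing the `set_major_locator` and `set_major_formatter` calls falls back to matplotlib's default date locator, which one of the comments also suggests trying.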
