If I plot a time series as a line with plot(), there is no problem:

from pydataset import data
import pandas as pd
import matplotlib.pyplot as plt
        
economics = data('economics')
economics['date'] = pd.to_datetime(economics.date)

fig, axes = plt.subplots()
axes.plot(economics['date'], economics['unemploy'])

[Image: line plot produced by axes.plot()]

But if I use bar(), the outcome is not what I expect:

fig, axes = plt.subplots()
axes.bar(economics['date'], economics['unemploy'])

[Image: bar plot produced by axes.bar()]

I get something closer to what I want with pandas' plot.bar():

import matplotlib.ticker as tkr

economics.set_index("date", inplace=True)

ax = economics['unemploy'].plot.bar()

# Display only the year, only for January and only every 5 years
ticklabels = [item.strftime('%Y') if item.month == 1 and item.year % 5 == 0 else '' for item in economics.index]
ax.xaxis.set_major_formatter(tkr.FixedFormatter(ticklabels))

[Image: bar plot produced by pandas' plot.bar()]

But I would like to understand what is wrong with the bars' heights in the plot that uses bar().

Julien Massardier
  • You can set a bar width, e.g. 25 days: `ax.bar(economics['date'], economics['unemploy'], width=25)`. Note that a bar plot is not very useful here compared to a line plot, especially with so many values. – JohanC Feb 07 '22 at 00:17
  • Yup, it was the width. Thank you! – Julien Massardier Feb 07 '22 at 08:40
  • In case someone gets here by accident: the default value of the width parameter is 0.8. It is expressed in data units (1 unit = 1 day here, because the x axis holds dates). There are approximately 15000 days between the first and last dates in this dataset, so a default-width bar takes up 0.8 / 15000 = 0.000053 of the x axis, i.e. about 0.0053% of its width. That is so small that most of the bars don't show at all. – Julien Massardier Feb 07 '22 at 14:33
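
To make the width arithmetic from the comments concrete, here is a minimal sketch. It assumes a synthetic monthly series in place of the pydataset economics data (the date range and values are made up for illustration), and it draws the same bars twice: once with the default width of 0.8 and once with width=25, as suggested above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Assumption: synthetic monthly data standing in for economics['date'] and
# economics['unemploy']; only the monthly spacing matters for the width issue.
dates = pd.date_range("1967-01-01", periods=480, freq="MS")
df = pd.DataFrame({"date": dates, "unemploy": np.linspace(3000, 8000, len(dates))})

# On a date axis, 1 data unit is 1 day, so the default width is 0.8 days.
span_days = (df["date"].iloc[-1] - df["date"].iloc[0]).days
default_fraction = 0.8 / span_days  # rough share of the x range covered by one bar
print(f"span: {span_days} days, default bar covers {default_fraction:.6%} of it")

fig, (ax_default, ax_wide) = plt.subplots(2, 1, sharex=True)

# Default width (0.8 days): each bar is far narrower than one pixel.
ax_default.bar(df["date"], df["unemploy"])
ax_default.set_title("default width (0.8 days)")

# Explicit width of 25 days: bars nearly fill the monthly spacing.
ax_wide.bar(df["date"], df["unemploy"], width=25)
ax_wide.set_title("width=25 days")
```

Calling fig.savefig(...) or plt.show() afterwards displays the two panels. The second panel should show every bar at its correct height, since the heights were never wrong; the default bars are simply too thin to render reliably.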
