
Consider this simple example:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
import matplotlib.dates as mdates

pd.__version__
Out[147]: u'0.22.0'

idx = pd.date_range('2017-01-01 05:03', '2017-01-01 18:03', freq = 'min')

df = pd.Series(np.random.randn(len(idx)),  index = idx)
df.head()
Out[145]: 
2017-01-01 05:03:00   0.4361
2017-01-01 05:04:00   0.9737
2017-01-01 05:05:00   0.8430
2017-01-01 05:06:00   0.4292
2017-01-01 05:07:00   0.5739
Freq: T, dtype: float64

I want to plot this, and have ticks every hour. I use:

fig, ax = plt.subplots()
hours = mdates.HourLocator(interval = 1)  # one major tick per hour
h_fmt = mdates.DateFormatter('%H:%M:%S')

df.plot(ax = ax, color = 'black', linewidth = 0.4)

ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(h_fmt)

which gives a plot (image not shown) where the ticks are not placed every hour.

Why don't the ticks appear every hour here? Thanks for your help!

Trenton McKinney
ℕʘʘḆḽḘ

2 Answers


The problem is that, while pandas generally just wraps the matplotlib plotting methods, this is not the case for plots with dates. As soon as dates are involved, pandas uses its own numerical representation of dates and hence also its own locators for the ticks. The matplotlib.dates locators then interpret the axis values in the wrong units, so the ticks do not land on the hours.
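To make the difference in units visible, here is a minimal sketch that plots the same series twice and prints the numeric x-limits of each axes. The names `series`, `ax_pandas`, `ax_compat`, `pandas_xlim` and `compat_xlim` are only illustrative, and the `Agg` backend is selected just so the snippet runs without a display. The exact numbers depend on the pandas and matplotlib versions installed; the point is only that the two axes are expected to use very different scales.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

idx = pd.date_range('2017-01-01 05:03', '2017-01-01 18:03', freq='min')
series = pd.Series(np.random.randn(len(idx)), index=idx)

fig, (ax_pandas, ax_compat) = plt.subplots(2, 1)
series.plot(ax=ax_pandas)                 # pandas' own date handling
series.plot(ax=ax_compat, x_compat=True)  # matplotlib-compatible date numbers

# Numeric limits of the x-axis in each case
pandas_xlim = ax_pandas.get_xlim()
compat_xlim = ax_compat.get_xlim()
print("pandas units:    ", pandas_xlim)
print("matplotlib units:", compat_xlim)
```

A matplotlib.dates locator attached to the first axes would read those pandas values as if they were matplotlib date numbers, which is why it cannot place hourly ticks there.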

If you want to use matplotlib.dates formatters or locators on plots created with pandas, you can pass the x_compat=True option to the pandas plot call:

df.plot(ax = ax, color = 'black', linewidth = 0.4, x_compat=True)

This allows you to use the matplotlib.dates formatters or locators as shown below. Alternatively, you may replace df.plot(ax = ax, color = 'black', linewidth = 0.4) with

ax.plot(df.index, df.values, color = 'black', linewidth = 0.4)

Complete example:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

idx = pd.date_range('2017-01-01 05:03', '2017-01-01 18:03', freq = 'min')
df = pd.Series(np.random.randn(len(idx)),  index = idx)

fig, ax = plt.subplots()
hours = mdates.HourLocator(interval = 1)
h_fmt = mdates.DateFormatter('%H:%M:%S')

ax.plot(df.index, df.values, color = 'black', linewidth = 0.4)
# or, instead of the line above, plot with pandas and x_compat=True:
# df.plot(ax = ax, color = 'black', linewidth = 0.4, x_compat=True)
# Then tick and format with matplotlib:
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(h_fmt)

fig.autofmt_xdate()
plt.show()



If the motivation to use pandas here is (as stated in the comments below) to be able to use secondary_y, the equivalent for matplotlib plots would be a twin axes, created with twinx.

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

idx = pd.date_range('2017-01-01 05:03', '2017-01-01 18:03', freq = 'min')

df = pd.DataFrame(np.cumsum(np.random.randn(len(idx), 2),0), 
                  index = idx, columns=list("AB"))

fig, ax = plt.subplots()
ax.plot(df.index, df["A"], color = 'black')
ax2 = ax.twinx()
ax2.plot(df.index, df["B"], color = 'indigo')

hours = mdates.HourLocator(interval = 1)
h_fmt = mdates.DateFormatter('%H:%M:%S')
ax.xaxis.set_major_locator(hours)
ax.xaxis.set_major_formatter(h_fmt)

fig.autofmt_xdate()
plt.show()


ImportanceOfBeingErnest
  • thanks. maybe a workaround to keep Pandas doing the plots would be to get a regular Timestamp index instead of a pandas datetime? – ℕʘʘḆḽḘ Feb 14 '18 at 16:22
  • I would like to keep the Pandas constructor because I use extensively the `secondary_y` argument of plot to show multiple times series... – ℕʘʘḆḽḘ Feb 14 '18 at 16:27
  • I can't imagine any workaround for this. `matplotlib.dates` assumes the numeric axes to represent days since 0001-01-01 UTC, plus 1. Only if the axes uses this datetime format, it will correctly tick and label the axes. On the other hand pandas will create its axes units depending on the data and use an appropriate locator for those units. The only alternative I could think of would be to manipulate the pandas formatters in use. I am not aware of any coherent way of doing this though. – ImportanceOfBeingErnest Feb 14 '18 at 16:28
  • The equivalent of secondary_y in matplotlib is to create a `twinx` and plot the second plot to that axes. This will in general cause 3 codelines compared to one, so it's not that bad. – ImportanceOfBeingErnest Feb 14 '18 at 16:30
  • Oh thats great. could you please add that to your answer then? just plot another random column. that would be a perfect solution – ℕʘʘḆḽḘ Feb 14 '18 at 16:31
  • that is assume the data is `df = pd.DataFrame({'value1' : np.random.randn(len(idx)), 'value2' : np.random.randn(len(idx))}, index = idx) df['value2'] = df['value2']* 1000` – ℕʘʘḆḽḘ Feb 14 '18 at 16:33
  • something very strange is that I am not able to rotate the x-axis in the dual axis plot. Do you know how to make the hour labels vertical (rot = 90)? Thanks again for this amazing solution – ℕʘʘḆḽḘ Feb 19 '18 at 01:59
  • Not sure what the problem is. Here the easiest would be `fig.autofmt_xdate(rotation=90)`. – ImportanceOfBeingErnest Feb 19 '18 at 02:19
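For the rotation question in the last two comments, a minimal sketch of the suggested `fig.autofmt_xdate(rotation=90)` call applied to the twin-axes example could look like the following. The `rotations` list is only there to inspect the result and is not part of the original answer; the `Agg` backend is an assumption so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
import numpy as np
import pandas as pd

idx = pd.date_range('2017-01-01 05:03', '2017-01-01 18:03', freq='min')
df = pd.DataFrame(np.cumsum(np.random.randn(len(idx), 2), 0),
                  index=idx, columns=list("AB"))

fig, ax = plt.subplots()
ax.plot(df.index, df["A"], color='black')
ax2 = ax.twinx()  # second y-axis sharing the same x-axis
ax2.plot(df.index, df["B"], color='indigo')

ax.xaxis.set_major_locator(mdates.HourLocator(interval=1))
ax.xaxis.set_major_formatter(mdates.DateFormatter('%H:%M:%S'))

# Rotate the hour labels to vertical, as suggested in the comment
fig.autofmt_xdate(rotation=90)

# Collect the label rotations to check that the setting was applied
rotations = [label.get_rotation() for label in ax.get_xticklabels()]
```

Setting the locator and formatter before calling `autofmt_xdate` keeps the rotation step last, so it acts on the hourly labels.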

Solution with pandas only

You can set ticks for every hour by using the timestamps of the DatetimeIndex. The ticks can be created by taking advantage of the datetime properties of the timestamps.

import numpy as np   # v 1.19.2
import pandas as pd  # v 1.1.3

idx = pd.date_range('2017-01-01 05:03', '2017-01-01 18:03', freq='min')
series = pd.Series(np.random.randn(len(idx)), index=idx)

ax = series.plot(color='black', linewidth=0.4, figsize=(10,4))
ticks = series.index[series.index.minute == 0]
ax.set_xticks(ticks)
ax.set_xticklabels(ticks.strftime('%H:%M'));
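The same idea extends to other spacings by combining datetime properties. As a hedged variation (not part of the original answer), the sketch below keeps only the full hours whose hour value is even, which gives a tick every two hours. The names `is_even_full_hour` and `labels` are illustrative, and the `Agg` backend is selected only so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; an assumption so this runs headless
import numpy as np
import pandas as pd

idx = pd.date_range('2017-01-01 05:03', '2017-01-01 18:03', freq='min')
series = pd.Series(np.random.randn(len(idx)), index=idx)

ax = series.plot(color='black', linewidth=0.4, figsize=(10, 4))

# Keep timestamps that fall on a full hour AND whose hour is even
is_even_full_hour = (series.index.minute == 0) & (series.index.hour % 2 == 0)
ticks = series.index[is_even_full_hour]
labels = ticks.strftime('%H:%M')

ax.set_xticks(ticks)
ax.set_xticklabels(labels)
```

Because the ticks are taken from the index itself, they are always in whatever units pandas chose for the axis, so no matplotlib.dates locator is needed.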


Patrick FitzGerald