I'm trying to plot two separate things from two pandas dataframes, but the x-axis is giving me trouble. When I use matplotlib.ticker to skip x-ticks, the date labels don't get skipped, so the x-axis labels don't match what is plotted.

For example, when the x-ticks are set to a base of 2, you'll see that the dates are going up by 1.

[image: base of 2]

But the graph has the same spacing when the base is set to 4, which you can see here:

[image: base of 4]

For the second image, the goal is for the days to increase by 4 each tick, so it should read 22, 26, 30, etc.

Here is the code that I'm working with:

ax = plot2[['Date','change value']].plot(x='Date',color='red',alpha=1,linewidth=1.5)
plt.ylabel('Total Change')
plot_df[['Date','share change daily']].plot(x='Date',secondary_y=True,kind='bar',ax=ax,alpha=0.4,color='black',figsize=(6,2),label='Daily Change')
plt.ylabel('Daily Change')
ax.legend(['Total Change (L)','Daily Change'])
plt.xticks(plot_df.index,plot_df['Date'].values)

myLocator = mticker.MultipleLocator(base=4)
ax.xaxis.set_major_locator(myLocator)

Any help is appreciated! Thanks :)

Phanster
  • Is your `Date` column datetime type or string type? – Quang Hoang Oct 19 '20 at 14:13
  • You should try passing the argument to `plt.xticks` twice, the function takes the first arg as the tick locations and the second one as the labels. It's possible you've modified the ticks, but the labels are taking the old values (i.e. the values from the underlying data) – Andrew Oct 19 '20 at 14:27
  • Changing it to a datetime column made no difference, and setting it to the index also made no difference. I'm not sure what you mean Andrew. Do you simply mean to have two of the lines that start with plt.xticks? If so that didn't do anything either. – Phanster Oct 20 '20 at 11:34
  • This question is related to these ones [here](https://stackoverflow.com/q/48790378/14148248), [here](https://stackoverflow.com/q/42880333/14148248), [here](https://stackoverflow.com/q/41640651/14148248), [here](https://stackoverflow.com/q/30133280/14148248), and [here](https://stackoverflow.com/q/45704366/14148248). – Patrick FitzGerald Jan 07 '21 at 20:54

1 Answer

First off, I suggest you set the date as the index of your dataframe. This lets pandas automatically format the date labels nicely when you create line plots and it lets you conveniently create a custom format with the strftime method.

This second point matters here, because plotting a bar plot over a line plot prevents you from getting the pandas line plot date labels: the x-axis units switch to integer positions starting at 0. (This is also the case when the dates are stored as strings instead of datetime objects, i.e. Timestamp objects in pandas.) You can check this for yourself by running ax.get_xticks() after creating the line plot (with a DatetimeIndex) and again after creating the bar plot.
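Here is a minimal sketch of that check. It assumes a small made-up frame with a daily DatetimeIndex (the column names and values are placeholders), a non-interactive backend, and it draws the bar plot on the same Axes rather than on a secondary one to keep things short:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, no window needed
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data, for illustration only
dti = pd.date_range(start='2020-07-22', periods=10, freq='D')
df = pd.DataFrame({'total': np.arange(10.0), 'daily': np.ones(10)},
                  index=dti)

ax = df.plot(y='total')           # line plot on a DatetimeIndex
ticks_line = ax.get_xticks()      # date-based units (large numbers)

df.plot(kind='bar', y='daily', ax=ax)
ticks_bar = ax.get_xticks()       # integer positions, one per bar

print(ticks_line)
print(ticks_bar)
```

The exact values printed for the line plot depend on the pandas and matplotlib versions, but they should be nowhere near the 0 to 9 range that the bar plot ends up with.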

There are too many peculiarities regarding the tick locators and formatters, the pandas plotting defaults, and the various ways in which you could define your custom ticks and tick labels for me to go into more detail here. So let me suggest you refer to the documentation for more information (though for your case you don't really need any of this): Major and minor ticks, Date tick labels, Custom tick formatter for time series, more examples using ticks, and the ticker module which contains the list of tick locators and formatters and their parameters.

Furthermore, you can identify the default tick locators and formatters used by the plotting functions with ax.get_xaxis().get_major_locator() or ax.get_xaxis().get_major_formatter() (you can do the same for the y-axis, and for minor ticks) to get an idea of what is happening under the hood.
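As a rough illustration of that inspection, here is a sketch on a plain matplotlib Axes with made-up numbers (not your data). It only prints the class names of the locator and formatter before and after the ticks are set manually:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [10, 20, 15, 30])   # made-up values

# Defaults chosen by matplotlib for a plain numeric axis
default_locator = type(ax.get_xaxis().get_major_locator()).__name__
default_formatter = type(ax.get_xaxis().get_major_formatter()).__name__
print(default_locator, default_formatter)

# Setting the ticks by hand swaps in a different locator
ax.set_xticks([0, 2])
fixed_locator = type(ax.get_xaxis().get_major_locator()).__name__
print(fixed_locator)
```

On a plain numeric axis this should report AutoLocator and ScalarFormatter at first, and FixedLocator once set_xticks has been called. The pandas plotting functions may install their own classes, so it is worth running the same two calls on your own Axes.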

On to solving your problem. Seeing as you want a fixed frequency of ticks for a predefined range of dates, I suggest that you avoid explicitly selecting a ticker locator and formatter and that instead you simply create the list of ticks and tick labels you want. First, here is some sample data similar to yours:

import numpy as np                 # v 1.19.2
import pandas as pd                # v 1.1.3
import matplotlib.pyplot as plt    # v 3.3.2

rng = np.random.default_rng(seed=1) # random number generator

dti = pd.bdate_range(start='2020-07-22', end='2020-09-03')
daily = rng.normal(loc=0, scale=250, size=dti.size)
total = -1900 + np.cumsum(daily)

df = pd.DataFrame({'Daily Change': daily,
                   'Total Change': total},
                  index=dti)
df.head()
            Daily Change  Total Change
2020-07-22     86.396048  -1813.603952
2020-07-23    205.404536  -1608.199416
2020-07-24     82.609269  -1525.590147
2020-07-27   -325.789308  -1851.379455
2020-07-28    226.338967  -1625.040488

The date is set as the index, which simplifies the code for creating the plots (no need to specify x). I use the same formatting arguments as in the example you gave, except for the figure size. Note that for setting the ticks and tick labels I do not use plt.xticks, because it acts on the current Axes, which here is the secondary Axes containing the bar plot, and for some reason the rotation and ha arguments get ignored.
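To see why pyplot-level calls end up on the secondary Axes, here is a small sketch in plain matplotlib. It uses ax.twinx() as a stand-in for what secondary_y=True does in pandas (that equivalence is an assumption made for illustration), and the values are made up:

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend, no window needed
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.plot([0, 1, 2, 3], [1, 3, 2, 4])             # primary Axes: line
ax2 = ax.twinx()                                # stand-in for secondary_y
ax2.bar([0, 1, 2, 3], [4, 3, 2, 1], alpha=0.4)  # secondary Axes: bars

# pyplot functions such as plt.xticks act on the current Axes
current_is_secondary = plt.gca() is ax2
print(current_is_secondary)

# Going through the Axes object makes the target explicit
ax.set_xticks([0, 2])
ax.set_xticklabels(['a', 'c'], rotation=0, ha='center')
```

Calling the set_xticks and set_xticklabels methods on ax directly, as in the code that follows, avoids depending on whichever Axes happens to be current.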

label_daily, label_total = df.columns

# Create pandas line plot: note the 'use_index' parameter
ax = df.plot(y=label_total, color='red', alpha=1, linewidth=1.5,
             use_index=False, ylabel=label_total)

# Create pandas bar plot: note that the second ylabel must be created
# after, else it overwrites the previous label on the left
df.plot(kind='bar', y=label_daily, color='black', alpha=0.4,
        ax=ax, secondary_y=True, mark_right=False, figsize=(9, 4))
plt.ylabel(label_daily, labelpad=10)

# Place legend in a better location: note that because there are two
# Axes, the combined legend can only be edited with the fig.legend
# method, and the ax legend must be removed
ax.legend().remove()
plt.gcf().legend(loc=(0.11, 0.15))

# Create custom x ticks and tick labels
freq = 4 # business days
xticks = ax.get_xticks()
xticklabels = df.index[::freq].strftime('%b-%d')
ax.set_xticks(xticks[::freq])
ax.set_xticks(xticks, minor=True)
ax.set_xticklabels(xticklabels, rotation=0, ha='center')

plt.show()

[image: pandas_twinax_line_bar]


The format codes accepted by strftime (such as %b and %d used above) are listed in the Python datetime documentation.
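As a quick way to try out format codes without plotting anything, here is a sketch using only the standard library, whose strftime accepts the same codes. The dates step by 4 calendar days (not business days), which is an assumption made to mirror the 22, 26, 30 sequence asked for in the question:

```python
from datetime import date, timedelta

start = date(2020, 7, 22)
ticks = [start + timedelta(days=4 * i) for i in range(3)]

day_only = [d.strftime('%d') for d in ticks]      # day of month only
month_day = [d.strftime('%b-%d') for d in ticks]  # abbreviated month + day

print(day_only)
print(month_day)
```

Note that %b gives the month abbreviation for the current locale, so the second list can differ between systems.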


For the sake of completeness, here are two alternative ways of creating exactly the same ticks but this time by making explicit use of matplotlib tick locators and formatters.

This first alternative uses lists of ticks and tick labels like before, but this time passing them to FixedLocator and FixedFormatter:

import matplotlib.ticker as mticker

# Create custom x ticks and tick labels
freq = 4 # business days
maj_locator = mticker.FixedLocator(ax.get_xticks()[::freq])
min_locator = mticker.FixedLocator(ax.get_xticks())
ax.xaxis.set_major_locator(maj_locator)
ax.xaxis.set_minor_locator(min_locator)

maj_formatter = mticker.FixedFormatter(df.index[maj_locator.locs].strftime('%b-%d'))
ax.xaxis.set_major_formatter(maj_formatter)
plt.setp(ax.get_xticklabels(), rotation=0, ha='center')

This second alternative makes use of the option to create a tick at every nth position of the index when using IndexLocator, combining it with FuncFormatter (instead of IndexFormatter which is deprecated):

import matplotlib.ticker as mticker

# Create custom x ticks and tick labels
maj_freq = 4 # business days
min_freq = 1 # business days
maj_locator = mticker.IndexLocator(maj_freq, 0)
min_locator = mticker.IndexLocator(min_freq, 0)
ax.xaxis.set_major_locator(maj_locator)
ax.xaxis.set_minor_locator(min_locator)

maj_formatter = mticker.FuncFormatter(lambda x, pos=None:
                                      df.index[int(x)].strftime('%b-%d'))
ax.xaxis.set_major_formatter(maj_formatter)
plt.setp(ax.get_xticklabels(), rotation=0, ha='center')

As you can see, both of these alternatives are more verbose than the initial example.

Patrick FitzGerald