Hiding every n-th xtick label, or hiding labels by value, on a Pandas plot / make x-axis readable

Question

The question is pretty long because of the pictures, but there isn't much content in reality. Question at the bottom.

Hi, I have a series of 30000 samples of ages ranging from 21 to 74. Series head:

0    24
1    26
2    34
3    37
4    57
Name: AGE, dtype: int64

I plot it using the built-in pandas .plot method:

import matplotlib.pyplot as plt

age_series = original_df['AGE']
fig = plt.figure()
fig.suptitle('Age distribution')
# One bar per distinct age, ordered by age
age_series.value_counts().sort_index().plot(kind='bar')

My problem is that it makes the x-axis not really user-friendly:

I could increase the horizontal width between bars, but I don't want to do that. Instead, I'd like to make only a subset of the x-axis labels visible. I tried using MaxNLocator and MultipleLocator by adding this line:

plt.gca().xaxis.set_major_locator(plt.MaxNLocator(10))

However, this doesn't achieve my goal: it now labels the bars incorrectly and removes ticks (which I understand, since these locators change the xticks object).
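The mislabelling follows from how the bar plot is built: pandas draws the bars at integer positions 0, 1, 2, ... and attaches the age labels as a fixed list, so a new locator moves the ticks while the labels are still handed out in list order. A minimal sketch of one way around this keeps a locator but looks each label up from the tick position. The data is a synthetic stand-in for original_df['AGE'], and label_for_position is a hypothetical helper name, not part of pandas or matplotlib.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import FuncFormatter, MultipleLocator

# Synthetic stand-in for original_df['AGE'] (assumption, not the asker's data)
rng = np.random.default_rng(0)
age_series = pd.Series(rng.integers(21, 75, size=30000), name="AGE")
counts = age_series.value_counts().sort_index()

fig, ax = plt.subplots()
counts.plot(kind="bar", ax=ax)


def label_for_position(x, pos):
    """Map a tick position (bar number) back to the age shown at that bar."""
    i = int(round(x))
    if abs(x - i) > 1e-9 or not 0 <= i < len(counts):
        return ""  # no bar at this position
    return str(counts.index[i])


# One tick every 10 bars; the formatter derives the label from the position
ax.xaxis.set_major_locator(MultipleLocator(10))
ax.xaxis.set_major_formatter(FuncFormatter(label_for_position))
fig.canvas.draw()
```

With this stand-in data the ticks fall on every tenth bar (ages 21, 31, 41, ...), so the labels are correct but are not multiples of 10.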

An ugly solution is to loop within the xticks object:

xticks = plt.gca().xaxis.get_major_ticks()
for i in range(len(xticks)):
    if i % 10 != 0:
        xticks[i].set_visible(False)

This gives a render that is close to what I want.

I'm not satisfied, however, as the loop is too naive. I'd like to access the value of each xtick (its label) and decide based on it, so that only labels that are multiples of 10 are shown.

This works (based upon this answer):

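ax = plt.gca()
labels = list(ax.get_xticklabels())  # Text objects, one per bar
for i, l in enumerate(labels):
    val = int(l.get_text())
    if val % 10 != 0:
        labels[i] = ''  # blank out labels that are not multiples of 10
ax.set_xticklabels(labels)  # apply once, after the loop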
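A variant of the same idea hides the label objects instead of replacing their text, so the tick marks stay in place and only labels whose value is a multiple of 10 remain visible. This is a sketch under assumptions: the data is a synthetic stand-in for original_df['AGE'].

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for original_df['AGE'] (assumption, not the asker's data)
rng = np.random.default_rng(0)
age_series = pd.Series(rng.integers(21, 75, size=30000), name="AGE")

fig, ax = plt.subplots()
age_series.value_counts().sort_index().plot(kind="bar", ax=ax)

# Decide per label, based on the age it displays
for label in ax.get_xticklabels():
    label.set_visible(int(label.get_text()) % 10 == 0)

# Collect the labels that are still shown
visible = [l.get_text() for l in ax.get_xticklabels() if l.get_visible()]
```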

Question: Is there a different solution that feels more Pythonic/efficient? Or do you have suggestions on how to make this data readable?

Does this answer your question? [Changing the "tick frequency" on x or y axis in matplotlib?](https://stackoverflow.com/questions/12608788/changing-the-tick-frequency-on-x-or-y-axis-in-matplotlib) — Mr. T, Feb 04 '21 at 19:46

Joe · Answer 1 · 2021-05-31T20:49:50.550

6

I think you could try something like this:

ax = plt.gca()
pos = [9,19,29,39,49]
l = [30,40,50,60,70]
ax.set(xticks=pos, xticklabels=l)

edited May 31 '21 at 20:49

answered Jun 12 '18 at 15:12

Joe

I modified it, check it now @JeanRostan. Before, I didn't notice that the values started from 21 and not from 0. – Joe Jun 12 '18 at 15:31
Thanks, it does work and is cleaner than the ugly loop. I accepted the answer below yours as it's more generic, but thanks a lot. – Jean Rostan Jun 12 '18 at 16:05

score 6 · Accepted Answer · answered Jun 12 '18 at 15:24

To be more generic, you could do something like this:

import matplotlib.pyplot as plt
import numpy as np

ax = plt.gca()

max_value = original_df['AGE'].max()
min_value = original_df['AGE'].min()
number_of_steps = 5
l = np.arange(min_value, max_value+1, number_of_steps)

ax.set(xticks=l, xticklabels=l)

Barthelemy Pavy

Thanks, that's what I was looking for; it's a lot cleaner than randomly looping. However, it needs a slight adjustment for the position: using `xticks=l` shifts the ticks to the right, since my first data point is 21. Here's the fix I added: `ax.set(xticks=[x - l[0] for x in l], xticklabels=l)` – Jean Rostan Jun 12 '18 at 16:07
No, I keep them but I relocate them using my first real value, so they are properly placed. If you omit xticks, all the values are 1-spaced, which causes mislabelling. – Jean Rostan Jun 12 '18 at 16:19
Ah yes, I didn't see your solution. Maybe you could just use `plt.xticks(l)` instead of `ax = plt.gca(); ax.set(xticks=l, xticklabels=l)`, but I have nothing to test it with. – Barthelemy Pavy Jun 12 '18 at 16:21
Doesn't work: the labels are 5-spaced but their values are 1-spaced (there are 5 spaces between labels 21 and 22), and they are shifted to the right. Thanks nonetheless, your initial solution is what I was looking for. – Jean Rostan Jun 12 '18 at 16:27
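The position fix discussed in these comments can be made independent of the starting age by reading the positions straight from the plotted index. This is a minimal sketch, assuming a synthetic stand-in for original_df['AGE'] and a hypothetical `step` variable for the label spacing.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Synthetic stand-in for original_df['AGE'] (assumption, not the asker's data)
rng = np.random.default_rng(0)
age_series = pd.Series(rng.integers(21, 75, size=30000), name="AGE")
counts = age_series.value_counts().sort_index()

fig, ax = plt.subplots()
counts.plot(kind="bar", ax=ax)

step = 10  # hypothetical name: keep labels that are multiples of this value
# Bars sit at positions 0..n-1, so a label's position is its rank in the index
positions = np.flatnonzero(counts.index % step == 0)
labels = [str(v) for v in counts.index[positions]]
ax.set_xticks(positions)
ax.set_xticklabels(labels)
```

Because positions come from the index itself, the ticks stay aligned even if some ages are absent from the data.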

score 0 · Answer 3 · answered Jun 13 '18 at 08:18

You could calculate all multiples of ten within your range of ages and pass them to your plot command via the xticks kwarg:

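import matplotlib.pyplot as plt
import numpy as np

age_series = original_df['AGE']

# All multiples of ten between the smallest and largest age
xt = np.arange(age_series.min(), age_series.max()+1)
xt = xt[xt%10==0]

fig = plt.figure()
fig.suptitle('Age distribution')
age_series.value_counts().sort_index().plot(kind='bar', xticks=xt)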
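Another option for readability is to draw the bars with matplotlib's own `ax.bar` on a numeric x-axis. The x coordinates are then the ages themselves, so a standard locator such as `MultipleLocator(10)` can place ticks on multiples of 10 without any position arithmetic. This is a sketch under assumptions: the data is a synthetic stand-in for original_df['AGE'].

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from matplotlib.ticker import MultipleLocator

# Synthetic stand-in for original_df['AGE'] (assumption, not the asker's data)
rng = np.random.default_rng(0)
age_series = pd.Series(rng.integers(21, 75, size=30000), name="AGE")
counts = age_series.value_counts().sort_index()

fig, ax = plt.subplots()
fig.suptitle("Age distribution")
# x values are the ages, not bar numbers, so the axis is truly numeric
ax.bar(counts.index.to_numpy(), counts.to_numpy(), width=0.8)
ax.xaxis.set_major_locator(MultipleLocator(10))
fig.canvas.draw()
```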
