I'm trying to plot a histogram of when individuals were contacted in a marketing campaign.

The data has values for all months, ranging from Jan to Dec, represented as 1 to 12 in the dataset.

I'm using the following code to generate the histogram, but the x ticks and x tick labels refuse to cooperate:

df['month'].hist()
plt.ylabel('Number of contacts')
plt.xlabel('Month contacted in current campaign')
plt.xticks(ticks = [1,2,3,4,5,6,7,8,9,10,11,12], labels = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
#ax.set_xticklabels(months,rotation=45, rotation_mode="anchor", ha="right")
plt.show()

This returns the chart below, but the x ticks don't line up with the bars, so the months don't get their non-numerical labels (Jan, Feb, Mar, etc.).

(image: the resulting histogram)

Grygger
Does this answer your question? [matplotlib strings as labels on x axis](https://stackoverflow.com/questions/7559242/matplotlib-strings-as-labels-on-x-axis) – RichieV Sep 01 '20 at 15:11

1 Answer

First off, when the data is discrete, the bins need to be set explicitly to obtain a useful histogram. The bin boundaries should be placed between the data values, for example at the half values (0.5, 1.5, ..., 12.5), so that every month gets a bin of its own.
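As a quick check of the half-value idea, here is a minimal sketch using `np.histogram` directly. The `months` array is just a stand-in with one value per month, not the real dataset.

```python
import numpy as np

# Stand-in data: one observation for each month 1..12 (not the real dataset)
months = np.arange(1, 13)

# Bin edges at the half values: 0.5, 1.5, ..., 12.5 (13 edges, so 12 bins)
edges = np.arange(0.5, 13, 1)

# Each integer month sits in the middle of exactly one bin
counts, _ = np.histogram(months, bins=edges)
print(edges)
print(counts)
```

With these edges every bin is centered on an integer, which is why ticks placed at 1 to 12 end up under the middle of each bar.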

By default, 10 bins are used, spaced evenly between the smallest (1) and largest (12) value. In this case the bin edges are at 1, 2.1, 3.2, 4.3, 5.4, 6.5, 7.6, 8.7, 9.8, 10.9 and 12. So months 1 and 2 fall into bin 0, month 3 into bin 1, and so on, with months 11 and 12 sharing the last bin. The bars therefore don't correspond to individual months, which is not desired.
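To see the default binning in numbers, the following sketch computes the 10 default edges with `np.histogram_bin_edges` (available in NumPy 1.15 and later). Again, `months` is only a stand-in with one value per month, so the counts show how many distinct months share each bin.

```python
import numpy as np

# Stand-in data: one observation for each month 1..12 (not the real dataset)
months = np.arange(1, 13)

# 10 equal-width bins between min and max, the same default that hist() uses
default_edges = np.histogram_bin_edges(months, bins=10)
counts, _ = np.histogram(months, bins=10)

print(np.round(default_edges, 1))
print(counts)  # number of distinct months landing in each bin
```

The first and last bins each collect two months, while the bins in between hold one month each but are offset from the integer positions.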

Once the bins are set up correctly, plt.xticks() should work as expected.

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt

weights = np.random.uniform(1, 5, 12)
weights /= sum(weights)
df = pd.DataFrame({'month': np.random.choice(range(1, 13), 10000, p=weights)})

df['month'].hist(bins=np.arange(0.5, 13, 1), facecolor='skyblue', edgecolor='white')
plt.ylabel('Number of contacts')
plt.xlabel('Month contacted in current campaign')
plt.xticks(ticks=range(1, 13),
           labels=['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun', 'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'])
plt.grid(False, axis='x')
plt.tick_params(axis='x', length=0) # hide x tick marks
plt.margins(x=0.01) # less padding near the bars
plt.show()

(image: example plot)

JohanC