My program takes n sets of data and plots their histograms.

I. How to label the vertical lines independent of the height of the plot?

A vertical line indicates the most frequent value in a dataset. I want to add a label showing that value at a fixed relative height, say 20% from the top. With matplotlib.pyplot.text() I had to assign the x and y values manually, so depending on the dataset the text ends up far too high or far too low, which I don't want.

matplot.axvline(most_common_number, linewidth=0.5, color='black')
matplot.text(most_common_number + 3, 10, str(most_common_number),
    horizontalalignment='center', fontweight='bold', color='black')

I also tried setting the label parameter of matplotlib.pyplot.axvline() but it only adds to the legend of the plot.

matplot.axvline(most_common_number, linewidth=0.5, color='black', label=str(most_common_number))

I wonder if there is a way to use percentages so the text appears n% from the top or use a different method to label the vertical lines. Or am I doing this all wrong?

II. How to make the ticks on x-axis to be spaced out better on resulting image?

I want the x-axis ticks to be multiples of 16, so I had to override the defaults. This is where the trouble began. When I save the plot to a PNG file, the x-axis ticks are crowded together.

The ticks on the x-axis need to be spaced out better

But when I use show() it works fine.

Program Snippet

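import matplotlib.pyplot as matplot
import matplotlib.ticker as matplotticker

# histogram options shared by all datasets
kwargs = dict(alpha=0.5, bins=37, range=(0, 304), density=False, stacked=True)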
fig, ax1 = matplot.subplots()

colors = ['tab:blue', 'tab:orange', 'tab:green', 'tab:red', 'tab:purple', 'tab:brown', 'tab:pink', 'tab:gray', 'tab:olive', 'tab:cyan']

count = 0
'''
datasets = [('dataset name', ['data'])]
'''
for item in datasets:
    dataset = item[1]
    most_common_number = most_common(dataset)
    ax1.hist(dataset, **kwargs, label=item[0], color=colors[count])
    matplot.axvline(most_common_number, linewidth=0.5, color='black')
    matplot.text(most_common_number + 3, 10, str(most_common_number),
        horizontalalignment='center', fontweight='bold', color='black')
    count += 1

#for x-axis
loc = matplotticker.MultipleLocator(base=16) # this locator puts ticks at regular intervals
ax1.xaxis.set_major_locator(loc)
#for y-axis
y_vals = ax1.get_yticks()
ax1.set_yticklabels(['{:3.1f}%'.format(x / len(datasets[0][1]) * 100) for x in y_vals])

#set title
matplot.gca().set(title='1 vs 2 vs 3')
#set subtitle
matplot.suptitle("This is a cool subtitle.", va="bottom", family="overpass")

matplot.legend()

fig = matplot.gcf()
fig.set_size_inches(16, 9)

matplot.savefig('out.png', format = 'png', dpi=120)
matplot.show()
Finch
  • What's the question? Seems you know the solution already? – ImportanceOfBeingErnest Sep 20 '19 at 14:14
  • I don't know the solution for the first part. The arbitrary `x`, `y` values I give for `matplotlib.text()` doesn't hold for all datasets. Sometimes the plot is scaled in such a way that the text goes way up or way down, sometimes even invisible! @ImportanceOfBeingErnest – Finch Sep 20 '19 at 14:23
  • Ah ok. What exactly do you mean by *n%*? Percent of what? The value of the bin within which the annotation occurs? – ImportanceOfBeingErnest Sep 20 '19 at 14:30
  • If the height of the plot is 100, I want the text beside the vertical lines to be at say 80. The problem is for one graph the value is 80 but for another graph the value is 240. Is there a way to tell the text to be always displayed at 80% the height of the plot no matter how big or small the plot is due to the values of the dataset? – Finch Sep 20 '19 at 15:30
  • Ah, % of the height of the plot is easy. `ax.text(most_common_number, 0.8, "text", transform=ax.get_xaxis_transform())` – ImportanceOfBeingErnest Sep 20 '19 at 15:36
  • This is it! This works for me. Thank you. – Finch Sep 20 '19 at 16:09

1 Answer

I. How to label the vertical lines independent of the height of the plot?

It can be done in two ways:

Axes limits

matplotlib.pyplot.xlim and matplotlib.pyplot.ylim

ylim() returns the lower and upper limits of the y-axis as a tuple, e.g. (0.0, 1707.3); xlim() does the same for the x-axis.

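matplot.text(most_common_number + matplot.xlim()[1] * 0.02, matplot.ylim()[1] * 0.8,
        str(most_common_number),
        horizontalalignment='center', fontweight='bold', color='black')

most_common_number + matplot.xlim()[1] * 0.02 places the text at the line's x position, shifted right by 2% of the upper x-limit, so that the text does not sit on top of the vertical line it labels.

matplot.ylim()[1] * 0.8 places the text at 80% of the upper y-limit, which is 80% of the height of the y-axis when the axis starts at 0, as it normally does for a histogram.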
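If the lower limit of the axis is not 0, multiplying the upper limit by a fraction no longer gives a fraction of the axis height. A minimal sketch of the more general calculation follows; position_at_fraction is a hypothetical helper name, not part of matplotlib, and the limits are made-up examples.

```python
def position_at_fraction(limits, fraction):
    """Return the data coordinate that lies `fraction` of the way along an axis.

    `limits` is a (low, high) pair, such as the tuple returned by ylim().
    """
    low, high = limits
    return low + fraction * (high - low)


# With a lower limit of 0 this is the same as ylim()[1] * 0.8.
label_y = position_at_fraction((0.0, 1707.3), 0.8)

# With a non-zero lower limit the two differ: this gives 180, not 200 * 0.8 = 160.
shifted_y = position_at_fraction((100.0, 200.0), 0.8)
```

In the plotting code this would be called as position_at_fraction(matplot.ylim(), 0.8), after all histograms have been drawn so that the limits are final.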

Or you can give y directly as a fraction of the axis height (e.g. 0.8) while x stays in data coordinates, using the transform parameter:

matplot.text(most_common_number, 0.8,
    '        ' + str(most_common_number), transform=ax1.get_xaxis_transform(),
    horizontalalignment='center', fontweight='bold', color='black')

Here y = 0.8 means at 80% height of y-axis.
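For reference, here is a small self-contained sketch of the transform approach. The data values and the variable names are made up for illustration, and the Agg backend is selected only so the script runs without a display. It uses horizontalalignment="left" with a single leading space as one possible alternative to padding a centered label with spaces.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt

# Made-up sample data; 32 is its most frequent value.
data = [16, 32, 32, 48, 32, 64, 16, 32]
most_common_number = 32

fig, ax = plt.subplots()
ax.hist(data, bins=8, range=(0, 128), alpha=0.5)
ax.axvline(most_common_number, linewidth=0.5, color="black")

# x is in data coordinates; y is a fraction of the axes height (0 = bottom, 1 = top).
label = ax.text(most_common_number, 0.8, " " + str(most_common_number),
                transform=ax.get_xaxis_transform(),
                horizontalalignment="left", fontweight="bold", color="black")

# Rescaling the y-axis afterwards does not move the label relative to the axes.
ax.set_ylim(0, 1000)
```

Because y is expressed in axes coordinates, the label stays at 80% of the axes height whatever the counts in the dataset are.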

II. How to make the ticks on x-axis to be spaced out better on resulting image?

Use matplotlib.pyplot.gcf() to get the figure, set its size in inches, and pass an explicit dpi when saving it (otherwise the text will not scale properly).

gcf() means "get current figure".

fig = matplot.gcf()
fig.set_size_inches(16, 9)
matplot.savefig('out.png', format = 'png', dpi=120)

So the resulting image will be (16*120, 9*120) or (1920, 1080) px.
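The pixel size can be checked with a short sketch like the one below. It saves to an in-memory buffer instead of out.png and reads the width and height from the PNG header; the histogram data is made up, and the Agg backend is chosen only so the script runs without a display.

```python
import io
import struct

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so this runs without a display
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker

fig, ax = plt.subplots()
ax.hist(list(range(0, 304, 7)), bins=37, range=(0, 304), alpha=0.5)  # made-up data
ax.xaxis.set_major_locator(ticker.MultipleLocator(base=16))  # ticks at multiples of 16

fig.set_size_inches(16, 9)
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=120)

# In a PNG file the IHDR chunk stores width and height as big-endian
# 4-byte integers at byte offsets 16 to 24.
width, height = struct.unpack(">II", buffer.getvalue()[16:24])
print(width, height)
```

A wider figure gives the ticks at every multiple of 16 more horizontal room, which is why the labels stop colliding in the saved file.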

Finch