
I'm trying to create a scatterplot for a bunch of probability values for two labels, but when I plot it, the labels appear left- and right-justified so there's a bunch of empty space in-between. Is there a way to narrow the gap between the two x-axis tick marks?

Here's the code I used:

import matplotlib.pyplot as plt

x = [1,2]
y = [[0.1, 0.6, 0.9],[0.5,0.7,0.8]]
colors = ['magenta', 'blue']
plt.title("Algorithm comparison - p-values")
for xe, ye,c in zip(x, y,colors):
    plt.scatter([xe] * len(ye), ye, c=c)
plt.xticks([1,2])
plt.axes().set_xticklabels(['Part 1','Part 2'],rotation = 45)

Thanks in advance! Please let me know if I left out any important details.

Mav

1 Answer

I have rewritten the code as closely as possible. The main problem is that you create two Axes objects: one implicitly with `for xe, ye, c in zip(x, y, colors): plt.scatter([xe] * len(ye), ye, c=c)`, and a second one with `plt.axes().set_xticklabels(['Part 1','Part 2'], rotation = 45)`. Instead, you can pass the labels directly to `plt.xticks()`:

import matplotlib.pyplot as plt

x = [1,2]
y = [[0.1, 0.6, 0.9],[0.5,0.7,0.8]]
colors = ['red', 'blue']
plt.title("Algorithm comparison - p-values")
for xe, ye,c in zip(x, y,colors):
    plt.scatter([xe] * len(ye), ye, c=c)
plt.xticks([1,2], ['Part 1','Part 2'], rotation = 45)

plt.show()

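Preferably, create one Axes object at the beginning and plot everything through it. Setting the x-limits with `ax.set_xlim(0.5, 2.5)` then leaves room on both sides, so the two ticks are no longer pushed to the edges of the plot: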

import matplotlib.pyplot as plt

fig, ax = plt.subplots()
x = [1,2]
y = [[0.1, 0.6, 0.9],[0.5,0.7,0.8]]
colors = ['red', 'blue']
ax.set_title("Algorithm comparison - p-values")
for xe, ye, c in zip(x, y, colors):
    ax.scatter([xe] * len(ye), ye, c=c)
ax.set_xticks(x)
ax.set_xticklabels(['Part 1','Part 2'],rotation = 45)
ax.set_xlim(0.5, 2.5)
plt.show()

Sample output: (image omitted)

More explanation of the differences between the object-oriented approach and the plt interface can be found here and in the matplotlib documentation.

Mr. T
  • Thanks so much! Out of curiosity, how would one go about plotting this, but if there were two different sets of data that each had a "part 1" and "part 2"? Like this image, but with the scatter above? https://www.tutorialspoint.com/matplotlib/images/multiple_bar_charts.jpg – Mav Jan 22 '22 at 02:53
  • Seaborn generates this easily as [stripplots](https://seaborn.pydata.org/generated/seaborn.stripplot.html) if you already have a pandas DataFrame. Since seaborn is built on matplotlib, you can of course imitate this: simply offset the x-values for the multiple scatter groups, e.g., 0.9 and 1.1 for `part 1` and 1.9 and 2.1 for `part 2`. Remember, we have only relabeled the numerical values 1 and 2 on the x-axis as `part 1` and `part 2`; they are still numerical positions. – Mr. T Jan 22 '22 at 10:30
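
The offsetting idea from the last comment could look roughly like the sketch below. It is only an illustration, not code from the answer: the dataset names `Set A` and `Set B`, their values, the helper `group_offsets`, and the spacing of 0.2 are all assumptions made up for this example.

```python
import matplotlib.pyplot as plt

# Hypothetical data: two datasets, each with values for "Part 1" and "Part 2".
data = {
    "Set A": [[0.1, 0.6, 0.9], [0.5, 0.7, 0.8]],
    "Set B": [[0.2, 0.4, 0.7], [0.3, 0.6, 0.95]],
}
colors = {"Set A": "red", "Set B": "blue"}
labels = ["Part 1", "Part 2"]
positions = [1, 2]   # numerical x-positions hidden behind the tick labels
spacing = 0.2        # assumed distance between neighbouring groups


def group_offsets(n_groups, spacing):
    """Return x-offsets centred on zero, one per group (illustrative helper)."""
    return [(i - (n_groups - 1) / 2) * spacing for i in range(n_groups)]


fig, ax = plt.subplots()
offsets = group_offsets(len(data), spacing)  # two groups: -0.1 and +0.1

for (name, parts), dx in zip(data.items(), offsets):
    for pos, values in zip(positions, parts):
        # Shift every point of this dataset sideways by its group offset.
        # Only the first scatter of each dataset gets a legend entry.
        ax.scatter(
            [pos + dx] * len(values),
            values,
            c=colors[name],
            label=name if pos == positions[0] else None,
        )

# The ticks stay at the unshifted positions, so each label sits between its groups.
ax.set_xticks(positions)
ax.set_xticklabels(labels, rotation=45)
ax.set_xlim(0.5, 2.5)
ax.set_title("Algorithm comparison - p-values")
ax.legend()
```

Call `plt.show()` (or `fig.savefig(...)`) afterwards to see the result. With two datasets the points land at x = 0.9 and 1.1 for `Part 1` and at 1.9 and 2.1 for `Part 2`, matching the values suggested in the comment; for more datasets, a smaller `spacing` keeps the groups from running into each other.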