I am trying to show one standard deviation above and below the mean of a data list in a box plot using matplotlib. How can I do this with `.boxplot()`, or is there another way to achieve it?

  • If you add and subtract the standard deviation from your data, you can use the `fill_between` function. – Jetman Apr 17 '19 at 07:24
  • Please take a look at [this similar question](https://stackoverflow.com/questions/17725927/boxplots-in-matplotlib-markers-and-outliers), and check whether your data is actually normally distributed. If it is, you should be able to use the `whis=` option in `boxplot()`; see the [documentation](https://matplotlib.org/api/pyplot_api.html#matplotlib.pyplot.boxplot). Note that such a plot might confuse readers. Also see [this alternative](https://stackoverflow.com/a/33330997/565489). – Asmus Apr 17 '19 at 10:37

1 Answer

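I have figured it out. Draw the box plot as usual, then overlay markers for the mean plus and minus one standard deviation (and for the maximum and minimum) at the same x positions: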

import matplotlib.pyplot as plt
import numpy as np

data_to_plot = [[0.61,0.62,0.6,0.62,0.64,0.63,0.61,0.6,0.57,0.62,0.6,0.62,0.6,0.64,0.61,0.6,0.64,0.58,0.6,0.62],
                [0.66,0.63,0.63,0.64,0.62,0.67,0.62,0.68,0.58,0.64,0.6,0.64,0.57,0.63,0.59,0.64,0.61,0.58,0.63,0.67],
                [0.6,0.58,0.59,0.6,0.61,0.57,0.63,0.6,0.57,0.6,0.6,0.6,0.61,0.59,0.59,0.59,0.64,0.59,0.58,0.62],
                [0.84,0.77,0.83,0.84,0.76,0.74,0.81,0.8,0.83,0.74,0.82,0.8,0.8,0.78,0.81,0.73,0.79,0.8,0.74,0.69]]

positions = np.arange(4) + 1

bp = plt.boxplot(data_to_plot,
                 showmeans=True,
                 positions=positions,
                 labels=['ReadUnCommit','ReadCommit','RepeatableRead','Serializable'])

# Per-group statistics to overlay on the boxes.
# Note: np.std defaults to ddof=0 (population standard deviation).
above_dev = [np.mean(data) + np.std(data) for data in data_to_plot]
under_dev = [np.mean(data) - np.std(data) for data in data_to_plot]
maxV = [np.max(data) for data in data_to_plot]
minV = [np.min(data) for data in data_to_plot]

# Square markers: red = mean + std, blue = mean - std, black = max, yellow = min
plt.plot(positions, above_dev, 'rs')
plt.plot(positions, under_dev, 'bs')
plt.plot(positions, maxV, 'ks')
plt.plot(positions, minV, 'ys')
plt.xlabel("isolation level")
plt.ylabel("average execution time")
plt.title('S=100,E=20,P=100')

plt.show()
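
As a possible variation on the code above, the two standard-deviation markers can be drawn as a single error bar per group with `errorbar`, placed slightly to the right of each box so it does not hide the box itself. This is only a sketch under a few assumptions: the `samples` values and the group names are made up for illustration, the 0.3 horizontal offset is arbitrary, and `ddof=1` (sample standard deviation) is my choice rather than something the original code used.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; omit for interactive use
import matplotlib.pyplot as plt
import numpy as np

# Made-up samples for illustration only (not the measurements above).
samples = [[0.61, 0.62, 0.60, 0.64, 0.57],
           [0.66, 0.63, 0.62, 0.68, 0.58]]
names = ["A", "B"]  # hypothetical group names
positions = np.arange(len(samples)) + 1

means = np.array([np.mean(s) for s in samples])
# ddof=1 gives the sample standard deviation; np.std defaults to ddof=0.
stds = np.array([np.std(s, ddof=1) for s in samples])

fig, ax = plt.subplots()
ax.boxplot(samples, positions=positions)
ax.set_xticks(positions)
ax.set_xticklabels(names)

# One error bar per group: marker at the mean, bars spanning mean ± 1 std.
ax.errorbar(positions + 0.3, means, yerr=stds,
            fmt="rs", capsize=4, label="mean ± 1 std")
ax.legend()
# Call plt.show() (interactive backend) or fig.savefig(...) to view the result.
```

Compared with separate markers, the error bar makes it visually obvious that the two points belong together, and the legend entry tells readers that the bar is a standard deviation rather than a whisker.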