0

I have a dataset that I want to visualize as a horizontal stacked bar chart.

The problem is that the values in the promoters, neutrals, and detractors columns for each year (first, second, and so on) are meant to add up to 100 (e.g., 21 + 46.5 + 32.5 = 100). However, the bars in my plot do not stack up to 100.

Any advice? Thanks!

from matplotlib import pyplot as plt

plt.rcParams["figure.figsize"] = [10, 8]
       
year = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
promoters = [21, 20.8, 21.8,27,24,20.5]
neutrals = [46.5, 56.0, 54.3,47.8,50.0,52.5]
detractors = [32.5, 23.3, 24.0,25.3,26.0,27.0]

b1 = plt.barh(year, promoters, color="darkseagreen")
b2 = plt.barh(year, neutrals, left=promoters, color="lightyellow")
b3 = plt.barh(year, detractors, left=neutrals, color="coral")


plt.legend([b1, b2, b3], ["promoters", "neutrals", "detractors"], loc="upper right")
plt.xlim([0, 100])

plt.show()

failed version of stacked data

opsv
  • There are several ways you could solve it (add an "unknown" component, normalize the sum to 100, or something else), but the right solution depends on WHY the numbers don't add up to 100. So this is not really a programming question, but rather a data presentation or visualization question. I think the question fits better on other SE sites, such as https://datascience.stackexchange.com/ – LudvigH Apr 19 '22 at 08:04
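A quick way to check the point raised in this comment is to sum each row before plotting. The following is only a minimal sketch using the data from the question; the names rows, totals, and normalized are illustrative and not part of the original post. For this data the row totals are all within about 0.1 of 100, which suggests the gap in the plot comes from the bar offsets rather than from the numbers themselves.

```python
promoters = [21, 20.8, 21.8, 27, 24, 20.5]
neutrals = [46.5, 56.0, 54.3, 47.8, 50.0, 52.5]
detractors = [32.5, 23.3, 24.0, 25.3, 26.0, 27.0]

# One tuple per year: (promoters, neutrals, detractors).
rows = list(zip(promoters, neutrals, detractors))

# Total of the three parts for each year; small rounding error is expected.
totals = [sum(row) for row in rows]
print([round(t, 1) for t in totals])

# Optional: rescale each row so its three parts sum to exactly 100.
normalized = [[100 * v / total for v in row] for row, total in zip(rows, totals)]
```

Normalizing is only one of the options mentioned in the comment; whether it is appropriate depends on why the totals deviate from 100 in the first place.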

2 Answers

1

Sorry, my English is not very good, so I can't explain it well, but here is my code; hopefully it makes things clear.

from matplotlib import pyplot as plt
import numpy as np


plt.rcParams["figure.figsize"] = [10, 8]

year = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
promoters = [21, 20.8, 21.8, 27, 24, 20.5]
neutrals = [46.5, 56.0, 54.3, 47.8, 50.0, 52.5]
detractors = [32.5, 23.3, 24.0, 25.3, 26.0, 27.0]

"""
This is my code
"""
# The starting point of b3 
detractors_left_arry = np.sum([promoters, neutrals], axis=0).tolist()
print(detractors_left_arry)

b1 = plt.barh(year, promoters, color="darkseagreen")
b2 = plt.barh(year, neutrals, left=promoters, color="lightyellow")
# b3 = plt.barh(year, detractors, left=neutrals, color="coral")
b3 = plt.barh(year, detractors, left=detractors_left_arry, color="coral")

plt.legend([b1, b2, b3], ["promoters", "neutrals", "detractors"], loc="upper right")
plt.xlim([0, 100])

plt.show()
qwerdf
0

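The problem is that the last plt.barh call uses the wrong left offset: left=neutrals starts the third bar at the neutrals value instead of at the end of the second bar. The third bar must start where the second one ends, so left has to be the element-wise sum of promoters and neutrals: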

from matplotlib import pyplot as plt
       
year = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
promoters = [21, 20.8, 21.8,27,24,20.5]
neutrals = [46.5, 56.0, 54.3,47.8,50.0,52.5]
detractors = [32.5, 23.3, 24.0,25.3,26.0,27.0]

plt.figure()
b1 = plt.barh(year, promoters, color="darkseagreen")
b2 = plt.barh(year, neutrals, left=promoters, color="lightyellow")
b3 = plt.barh(year, detractors, left=[t1 + t2 for t1, t2 in zip(promoters, neutrals)], color="coral")

plt.legend([b1, b2, b3], ["promoters", "neutrals", "detractors"], loc="upper right")
plt.xlim([0, 100])

plt.show()
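The same idea extends to any number of stacked segments: each bar starts at the running total of all the bars before it. The sketch below is one possible way to write that, not the only one; left_offsets and plot_stacked_barh are hypothetical helper names introduced here for illustration, and only plt.barh and plt.legend are actual matplotlib calls.

```python
def left_offsets(segments):
    """Return, for each segment, the left edge of its bar in every row."""
    running = [0.0] * len(segments[0])
    offsets = []
    for values in segments:
        offsets.append(list(running))  # this segment starts at the running total
        running = [r + v for r, v in zip(running, values)]
    return offsets


def plot_stacked_barh(labels, segments, names, colors):
    """Draw one horizontal stacked bar per label (hypothetical helper)."""
    # Imported here so left_offsets can be used without matplotlib installed.
    from matplotlib import pyplot as plt

    bars = [
        plt.barh(labels, values, left=left, color=color)
        for values, left, color in zip(segments, left_offsets(segments), colors)
    ]
    plt.legend(bars, names, loc="upper right")
    return bars


year = ['first', 'second', 'third', 'fourth', 'fifth', 'sixth']
promoters = [21, 20.8, 21.8, 27, 24, 20.5]
neutrals = [46.5, 56.0, 54.3, 47.8, 50.0, 52.5]
detractors = [32.5, 23.3, 24.0, 25.3, 26.0, 27.0]

# Left edges: zeros for promoters, promoters for neutrals,
# promoters + neutrals for detractors.
offsets = left_offsets([promoters, neutrals, detractors])
```

To draw the chart, call plot_stacked_barh(year, [promoters, neutrals, detractors], ["promoters", "neutrals", "detractors"], ["darkseagreen", "lightyellow", "coral"]) and then plt.xlim([0, 100]) and plt.show() as in the code above.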


Davide_sd