Some datasets contain a number of variables, each made up of contributions from a selection of other 'things'. It can be useful to show the contribution (e.g. as a %) of each 'thing' to each variable. However, not every 'thing' contributes to every variable. When the data are plotted as a grouped bar chart, this leaves empty slots wherever a variable has no contribution from a 'thing'. Is there a way to skip the bar for a 'thing' entirely when its contribution to a variable is zero?
The example below shows a selection of variables (a-j), each of which can receive contributions from various 'things' (1-5). Note the gaps where the contribution of a 'thing' (1-5) to a variable (a-j) is zero.
from random import randrange

import matplotlib.figure
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Make the dataset of data for variables (a-j)
columns = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
data = np.array([np.random.randn(5)**2 for i in range(10)])
df = pd.DataFrame(data.T, columns=columns)
for col in df.columns:
    # Set up to 3 of the 5 'things' to NaN per column (randrange may repeat an index)
    for n in np.arange(3):
        idx = randrange(5)
        df.loc[df.index[idx], col] = np.nan
    # Normalise the remaining values so each column sums to 100%
    df.loc[:, col] = df[col].values / df[col].sum() * 100

# Setup plot
figsize = matplotlib.figure.figaspect(.33)
fig = plt.figure(figsize=figsize)
ax = plt.gca()
df.T.plot.bar(rot=0, ax=ax)
# Add a legend and show
plt.legend(ncol=len(columns))
plt.show()
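
To make the desired output concrete, here is a minimal sketch of the kind of plot I am after, not a definitive solution. It bypasses `df.T.plot.bar` and places each bar by hand with `ax.bar`, so only the non-missing, non-zero contributions take up space within each group. The helper name `plot_compact_bars` and its `group_width` parameter are hypothetical (they are not part of pandas or matplotlib), and the tiny `demo` frame is made-up data used only to exercise the function.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd


def plot_compact_bars(df, ax, group_width=0.8):
    """Draw one cluster of bars per column of df, skipping NaN and zero entries.

    Hypothetical helper for illustration; rows of df are the 'things',
    columns are the variables. Returns the number of bars drawn.
    """
    colors = plt.rcParams["axes.prop_cycle"].by_key()["color"]
    width = group_width / len(df.index)  # same bar width a full group would use
    labelled = set()
    n_drawn = 0
    for pos, col in enumerate(df.columns):
        present = df[col].dropna()
        present = present[present != 0]
        if present.empty:
            continue
        # Centre the remaining bars on the tick position of this variable
        offsets = (np.arange(len(present)) - (len(present) - 1) / 2) * width
        for offset, (thing, value) in zip(offsets, present.items()):
            row = df.index.get_loc(thing)
            # Label each 'thing' only once so the legend has no duplicates
            label = "_nolegend_" if thing in labelled else str(thing)
            ax.bar(pos + offset, value, width=width,
                   color=colors[row % len(colors)], label=label)
            labelled.add(thing)
            n_drawn += 1
    ax.set_xticks(range(len(df.columns)))
    ax.set_xticklabels(df.columns)
    return n_drawn


# Made-up data: rows are 'things', columns are variables
demo = pd.DataFrame({"a": [60.0, np.nan, 40.0], "b": [np.nan, 100.0, np.nan]})
fig, ax = plt.subplots()
n_bars = plot_compact_bars(demo, ax)
ax.legend()
```

With the dataset above, the call would be `plot_compact_bars(df, ax)` on the untransposed `df`, since the sketch already treats columns as the groups. Each 'thing' keeps the colour of its row, so colours stay consistent across variables even though the bar positions within a group shift.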