
I want to create a matplotlib bar plot that has the look of a stacked plot without being additive from a multi-index pandas dataframe.

The code below gives the basic behaviour:

%matplotlib notebook
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import io

data = io.StringIO('''Fruit,Color,Price
Apple,Red,1.5
Apple,Green,1.0
Pear,Red,2.5
Pear,Green,2.3
Lime,Green,0.5
Lime,Red,3.0
''')
df_unindexed = pd.read_csv(data)
df_unindexed
df = df_unindexed.set_index(['Fruit', 'Color'])
df.unstack().plot(kind='bar')

The plot command df.unstack().plot(kind='bar') shows the prices for each fruit as separate bars grouped next to each other. With df.unstack().plot(kind='bar',stacked=True), the Red and Green prices are added together and stacked.

I want a plot that is halfway between the two: each group is shown as a single bar, but the values are overlaid rather than summed, so you can still see them all. The figure below (made in PowerPoint) shows the behaviour I am looking for; I want the image on the right.

Short of calculating all the values and then using the stacked option, is this possible?

example bar plot

Esme_
  • plot group-by data separately on the same axis – pcu Jan 04 '19 at 10:28
  • Possible duplicate of [How to create overlay bar plot in pandas](https://stackoverflow.com/questions/50158081/how-to-create-overlay-bar-plot-in-pandas) – pcu Jan 04 '19 at 10:33

2 Answers


This seems (to me) like a bad idea, since this representation leads to several problems. Will a reader understand that those are not stacked bars? What happens when the front bar is taller than the ones behind?

In any case, to accomplish what you want, I would simply call plot() repeatedly, once for each subset of the data, using the same axes so that the bars are drawn on top of each other. In your example, the "Red" prices are always higher, so I had to adjust the order to plot them in the back, or they would hide the "Green" bars.

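fig, ax = plt.subplots()

# Draw 'Red' first so the taller bars end up behind the 'Green' ones
my_groups = ['Red', 'Green']
df_group = df_unindexed.groupby("Color")

for color in my_groups:
    temp_df = df_group.get_group(color)
    # Reusing the same axes draws each group's bars over the previous ones
    temp_df.plot(kind='bar', ax=ax, x='Fruit', y='Price', color=color, label=color)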
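
If the exact prices should also appear on the bars, one option is matplotlib's Axes.bar_label (available from matplotlib 3.4). The following is only a sketch under a few assumptions: the data is the question's sample with no stray spaces in the CSV, the subsets are selected with a boolean mask instead of groupby, the colour names are lower-cased, and the Agg backend line is there only so the snippet runs outside a notebook. The name label_texts is introduced here purely for illustration.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive run; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

data = io.StringIO('''Fruit,Color,Price
Apple,Red,1.5
Apple,Green,1.0
Pear,Red,2.5
Pear,Green,2.3
Lime,Green,0.5
Lime,Red,3.0
''')
df_unindexed = pd.read_csv(data)

fig, ax = plt.subplots()

# Overlay the groups on one axes, taller group ('Red') first
for color in ['Red', 'Green']:
    subset = df_unindexed[df_unindexed['Color'] == color]
    subset.plot(kind='bar', ax=ax, x='Fruit', y='Price',
                color=color.lower(), label=color)

# Each plot call added one bar container; label every bar with its height
label_texts = []
for container in ax.containers:
    texts = ax.bar_label(container, fmt='%.1f')
    label_texts.extend(t.get_text() for t in texts)
```

The labels of the front bars are drawn inside the bars behind them, so it may be worth checking that they stay readable against the background colour.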


Diziet Asahi
  • I understand that there are many datasets where this would be a bad idea, however it happens to be well suited to my data. My data is strictly monotonically increasing within each group (and between groups as well it happens). There are some important patterns in the data that are just easier to see with fewer bars cluttering up the plot. Thanks for the solution - I'll be sure to have a very clear caption to avoid confusion with stacked bars (which don't make sense for this data anyway). – Esme_ Jan 07 '19 at 08:37
  • Can you please explain to me how to add the exact prices on top of each bar as I also have a plot with a similar scenario? – Vivek Mar 15 '21 at 00:39

There are two problems with this kind of plot. (1) What if the background bar is smaller than the foreground bar? It would simply be hidden and not visible. (2) A chart like this is not distinguishable from a stacked bar chart. Readers will have severe problems interpreting it.

That being said, you can plot both columns individually.

import matplotlib.pyplot as plt
import pandas as pd
import io

data = io.StringIO('''Fruit,Color,Price
Apple,Red,1.5
Apple,Green,1.0
Pear,Red,2.5
Pear,Green,2.3
Lime,Green,0.5
Lime,Red,3.0''')

df_unindexed = pd.read_csv(data)
df = df_unindexed.set_index(['Fruit', 'Color']).unstack()
df.columns = df.columns.droplevel()

plt.bar(df.index, df["Red"].values, label="Red")
plt.bar(df.index, df["Green"].values, label="Green")
plt.legend()
plt.show()
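
Problem (1) can be sidestepped by giving every bar its own z-order, so that the shorter bar at each position is always drawn in front, whichever column it belongs to. What follows is a rough sketch rather than a definitive fix: it assumes all prices are positive, uses pivot instead of set_index/unstack to reshape the data, and the names prices, tallest and zorders are made up for this example. It does nothing about problem (2).

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: non-interactive run; drop this in a notebook
import matplotlib.pyplot as plt
import pandas as pd

data = io.StringIO('''Fruit,Color,Price
Apple,Red,1.5
Apple,Green,1.0
Pear,Red,2.5
Pear,Green,2.3
Lime,Green,0.5
Lime,Red,3.0''')

# One row per fruit, one column per colour
prices = pd.read_csv(data).pivot(index='Fruit', columns='Color', values='Price')
tallest = prices.max().max()

fig, ax = plt.subplots()
zorders = {}
for color in prices.columns:
    bars = ax.bar(prices.index, prices[color].values,
                  color=color.lower(), label=color)
    for fruit, bar in zip(prices.index, bars):
        # Map heights in (0, tallest] to z-orders in [1, 2):
        # the taller the bar, the lower its z-order, so it is drawn further back
        z = 2 - bar.get_height() / tallest
        bar.set_zorder(z)
        zorders[(fruit, color)] = z
ax.legend()
```

With this approach the plotting order of the columns no longer matters, because the stacking at each x position is decided by the bar heights alone.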


ImportanceOfBeingErnest