
I'm working on a school project and I'm stuck making a grouped bar chart. I found this article online with an explanation: https://www.pythoncharts.com/2019/03/26/grouped-bar-charts-matplotlib/ I have a dataset with an Age column and a Sex column. The Age column holds the client's age in years, and the Sex column holds 0 for female and 1 for male. I want to plot the age difference between male and female clients. I tried the following code, based on the example:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import pylab as pyl
fig, ax = plt.subplots(figsize=(12, 8))
x = np.arange(len(data.Age.unique()))
# Define bar width. We'll use this to offset the second bar.
bar_width = 0.4
# Note we add the `width` parameter now which sets the width of each bar.
b1 = ax.bar(x, data.loc[data['Sex'] == '0', 'count'], width=bar_width)
# Same thing, but offset the x by the width of the bar.
b2 = ax.bar(x + bar_width, data.loc[data['Sex'] == '1', 'count'], width=bar_width)

This raised the following error: KeyError: 'count'

Then I tried to change the code a bit and got another error:

import pandas as pd
import numpy as np
from matplotlib import pyplot as plt
import pylab as pyl
fig, ax = plt.subplots(figsize=(12, 8))
x = np.arange(len(data.Age.unique()))
# Define bar width. We'll use this to offset the second bar.
bar_width = 0.4
# Note we add the `width` parameter now which sets the width of each bar.
b1 = ax.bar(x, (data.loc[data['Sex'] == '0'].count()), width=bar_width)
# Same thing, but offset the x by the width of the bar.
b2 = ax.bar(x + bar_width, (data.loc[data['Sex'] == '1'].count()), width=bar_width)

This raised the error: ValueError: shape mismatch: objects cannot be broadcast to a single shape

How do I count the rows so that I can make this grouped bar chart?

Jdiehl
  • What do you mean by age difference? Average age difference? Distribution difference? – Mayeul sgc Oct 11 '19 at 12:23
  • Please read [mcve] and [How to make good reproducible pandas examples](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). The error you get is simply that there is no column named `"count"` in your dataframe. But one cannot know why that is. – ImportanceOfBeingErnest Oct 11 '19 at 12:26
  • *Now I have a dataset with an Age column and a Sex column*: Your data doesn't have a `'count'` column, you cannot do: `data.loc[data['Sex'] == '0', 'count']` – Quang Hoang Oct 11 '19 at 13:27

1 Answer


It seems like the article goes to too much trouble just to plot a grouped bar chart. With pandas you can count and plot in one chain:

import numpy as np
import pandas as pd

np.random.seed(1)
data = pd.DataFrame({'Sex':np.random.randint(0,2,1000),
                     'Age':np.random.randint(20,50,1000)})

(data.groupby('Age')['Sex'].value_counts()        # count the Sex values for each Age
     .unstack('Sex')                              # turn Sex into columns
     .plot.bar(figsize=(12,6))                    # plot grouped bar
)
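
To see why this works, it can help to look at the intermediate table before plotting. This is a minimal sketch, assuming the same synthetic `data` as above. The `fill_value=0` argument is an addition here, so that an Age with no rows for one Sex shows 0 rather than NaN, and `pd.crosstab` is shown as an alternative way to build the same table:

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real dataset (an assumption, as in the answer).
np.random.seed(1)
data = pd.DataFrame({'Sex': np.random.randint(0, 2, 1000),
                     'Age': np.random.randint(20, 50, 1000)})

# One row per Age, one column per Sex, each cell holding a row count.
counts = (data.groupby('Age')['Sex']
              .value_counts()
              .unstack('Sex', fill_value=0))

# pd.crosstab builds the same Age-by-Sex count table in one call.
counts_alt = pd.crosstab(data['Age'], data['Sex'])

print(counts.head())
```

Each column of `counts` is the per-age count for one Sex value, which is the quantity the `'count'` lookup in the question was expecting to find.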

Or even simpler with seaborn:

import seaborn as sns
from matplotlib import pyplot as plt

fig, ax = plt.subplots(figsize=(12,6))
sns.countplot(data=data, x='Age', hue='Sex', ax=ax)

Output:

[image: grouped bar chart of counts per Age, with one bar per Sex]
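
If you would rather keep the manual matplotlib approach from the article, the missing step is computing the counts first. The following is only a sketch under assumptions: it uses the synthetic `data` from above, assumes Sex is stored as the integers 0 and 1 (if your column really holds the strings `'0'` and `'1'`, index the columns with those instead), and the axis labels are illustrative:

```python
import numpy as np
import pandas as pd
from matplotlib import pyplot as plt

# Synthetic stand-in for the real dataset (an assumption).
np.random.seed(1)
data = pd.DataFrame({'Sex': np.random.randint(0, 2, 1000),
                     'Age': np.random.randint(20, 50, 1000)})

# Count rows per (Age, Sex): rows are ages, columns are the Sex values.
counts = pd.crosstab(data['Age'], data['Sex'])

x = np.arange(len(counts.index))  # one x position per distinct Age
bar_width = 0.4

fig, ax = plt.subplots(figsize=(12, 8))
# The columns are the integers 0 and 1 here, not the strings '0' and '1'.
b1 = ax.bar(x, counts[0], width=bar_width, label='female (0)')
# Same thing, but offset the x by the width of the bar.
b2 = ax.bar(x + bar_width, counts[1], width=bar_width, label='male (1)')

# Centre each tick under its pair of bars and label it with the age.
ax.set_xticks(x + bar_width / 2)
ax.set_xticklabels(counts.index)
ax.set_xlabel('Age')
ax.set_ylabel('Number of clients')
ax.legend()
```

Because `x` and both count columns come from the same table, they always have the same length, which avoids the shape mismatch error from the question. Call `plt.show()` or `fig.savefig(...)` afterwards to display or save the figure.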

Quang Hoang