This is my code:

import pandas as pd
import matplotlib.pyplot as plt

data = pd.read_csv('wages.csv')
# Keep only the 2019 rows for Canada
dataFrame = data[(data['YEAR'] == 2019) & (data['Geography'] == 'Canada')]

education_group = dataFrame.groupby(['Education level'])
# Select several columns with a list (double brackets); a bare tuple fails in recent pandas
education_group[['Male', 'Both Sexes', 'Female']].mean()

Which produces the following Output:

[image: output of the code above]

I've tried several ways of grouping and plotting it, but each attempt resulted in a mess.

Essentially, I want the x-axis to show each unique education level, with three bars for each one (Male, Female, Both Sexes), if that makes sense.

If someone could point me in the right direction, that'd be great. Thanks!

Edit: I need it plotted as a bar graph.

Jona
  • Is using the `seaborn` package an option? – xicocaio Mar 30 '21 at 20:05
  • Yup! Anything to be honest really. – Jona Mar 30 '21 at 20:06
  • Could you provide sample data? [Look at this](https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples). – xicocaio Mar 30 '21 at 20:07
  • I posted an answer; I hope it helps =). In the future, though, please avoid external links to your data. Not only is it discouraged by [SO guidelines](https://stackoverflow.com/help/how-to-ask), but you will also find that fewer people feel motivated to put in the effort to answer your questions. – xicocaio Mar 30 '21 at 20:37
  • @xicocaio Apologies! I've deleted it. Taking a look at your answer right now, thank you! ;D – Jona Mar 30 '21 at 20:48
  • No problem, let me know if I answered correctly, I can make adjustments if needed. – xicocaio Mar 30 '21 at 20:59

1 Answer

Consider the following sample data:

import pandas as pd

# Small sample frame: one row per education level, one column per gender
df = pd.DataFrame({'education': {0: 'grad', 1: 'masters', 2: 'high school'},
                   'Male': {0: 10, 1: 9, 2: 8},
                   'Female': {0: 1, 1: 3, 2: 5},
                   'Both Sexes': {0: 2, 1: 4, 2: 6}})
education       Male    Female  Both Sexes
grad            10      1       2
masters         9       3       4
high school     8       5       6

First, melt the three gender columns into a single categorical `gender` column with the values in a `wage` column, keeping the remaining columns intact.

df_to_plot = pd.melt(df, id_vars=['education'],
                     value_vars=['Male', 'Female', 'Both Sexes'],
                     var_name='gender', value_name='wage')
education       gender          wage
grad            Male            10
masters         Male            9
high school     Male            8
grad            Female          1
masters         Female          3
high school     Female          5
grad            Both Sexes      2
masters         Both Sexes      4
high school     Both Sexes      6
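
To apply the same reshaping to the grouped means from the question, something like the following sketch might work. It assumes the column names from the question (`Education level`, `Male`, `Female`, `Both Sexes`); the numbers are made up for illustration, and `means` and `long_means` are hypothetical names.

```python
import pandas as pd

# Made-up stand-in for the filtered 2019 Canada rows from the question
dataFrame = pd.DataFrame({
    'Education level': ['High school', 'High school', 'Bachelor', 'Bachelor'],
    'Male': [20.0, 22.0, 30.0, 34.0],
    'Female': [18.0, 20.0, 28.0, 30.0],
    'Both Sexes': [19.0, 21.0, 29.0, 32.0],
})

# Mean wage per education level; reset_index turns the group key back into a column
means = (dataFrame.groupby('Education level')[['Male', 'Female', 'Both Sexes']]
         .mean()
         .reset_index())

# Long format: one row per (education level, gender) pair
long_means = pd.melt(means, id_vars=['Education level'],
                     value_vars=['Male', 'Female', 'Both Sexes'],
                     var_name='gender', value_name='wage')
print(long_means)
```

The resulting frame has one `wage` value per education level and gender, which is the shape that `hue`-based plotting expects.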

Finally, following the seaborn documentation, we use the following code:

import seaborn as sns
sns.barplot(x="education", y="wage", hue='gender', data=df_to_plot)

This produces the following plot:

[image: grouped bar plot produced by the code above]
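
If seaborn is not available, a similar grouped bar chart can probably be drawn with pandas' own plotting on the wide (unmelted) frame. This is only a sketch of an alternative, not part of the seaborn approach above; it reuses the sample data, and the `wage` axis label is an assumption.

```python
import pandas as pd
import matplotlib.pyplot as plt

df = pd.DataFrame({'education': ['grad', 'masters', 'high school'],
                   'Male': [10, 9, 8],
                   'Female': [1, 3, 5],
                   'Both Sexes': [2, 4, 6]})

# With education as the index, each remaining column becomes one bar per group
ax = df.set_index('education')[['Male', 'Female', 'Both Sexes']].plot.bar(rot=0)
ax.set_ylabel('wage')  # assumed label; the sample values stand in for wages
plt.tight_layout()
```

When running this as a script rather than in a notebook, call `plt.show()` afterwards to display the figure.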

xicocaio