EDIT: My question is not a duplicate of "Given a pandas Series that represents frequencies of a value, how can I turn those frequencies into percentages?", because I am asking for a plot, not a frequency table. This question has been misclassified.

I am trying to replicate a bar graph showing the frequency or percentage of each value of a string variable in Python.

I can get this in Stata, but I have not managed it in Python. The Stata code is below, followed by my Python code:

clear all
input str10 Genero
"Mujer"
"Mujer"
"Hombre"
"Hombre"
"Hombre"
end

graph bar (percent), over(Genero)

[Image: Stata output]

Python code with the same data, which fails to produce the plot:

import numpy as np
import os
import seaborn as sns
import pandas as pd
import matplotlib.pyplot as plt

os.chdir("C:/Users/Desktop")
import matplotlib.ticker as mtick

df = pd.DataFrame({'Genero': ['Mujer','Mujer','Hombre','Hombre','Hombre']})

print(df)


axx = df.plot.bar(x='Genero')
axx.yaxis.set_major_formatter(mtick.PercentFormatter())
plt.savefig('myfilename.png')
  • As per the duplicates `vc = df.Genero.value_counts(normalize=True).mul(100)` and `ax = vc.plot(kind='bar', title='Percent per group', ylabel='Percent', rot=0)` is the correct implementation. This question is a duplicate. – Trenton McKinney Jul 08 '23 at 17:37

1 Answer

This seems fairly straightforward.

counts = df['Genero'].value_counts()  # number of rows per category, computed once
x = counts.index.tolist()  # x-axis groups
y = (counts / counts.sum() * 100).tolist()  # y-axis values as percentages

fig, axs = plt.subplots(1, figsize=(20,10))
axs.set_title("TITLE")
axs.set_xlabel('XLABEL')
axs.set_ylabel('YLABEL')

axs.bar(x,y)

plt.show()

The code could be a little cleaner, and perhaps someone has a better way of doing it, but it should be fine for your purposes.
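For a shorter version, here is a minimal sketch based on the `value_counts(normalize=True)` approach suggested in the comment under the question. It assumes the same `df` as in the question. The name `pct` and the axis labels are only illustrative.

```python
import matplotlib.pyplot as plt
import matplotlib.ticker as mtick
import pandas as pd

# Same data as in the question
df = pd.DataFrame({'Genero': ['Mujer', 'Mujer', 'Hombre', 'Hombre', 'Hombre']})

# Share of rows per category, scaled from 0-1 to 0-100
pct = df['Genero'].value_counts(normalize=True).mul(100)

# Plot the percentages directly; rot=0 keeps the category labels horizontal
ax = pct.plot(kind='bar', rot=0)
ax.set_xlabel('Genero')    # illustrative label
ax.set_ylabel('Percent')   # illustrative label

# PercentFormatter assumes values on a 0-100 scale by default (xmax=100)
ax.yaxis.set_major_formatter(mtick.PercentFormatter())
```

Call `plt.show()` or `plt.savefig(...)` afterwards to display or save the figure. The original `df.plot.bar(x='Genero')` has nothing numeric to plot, which is why the percentages are computed first.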

IbbyR