I want to create a plot in Python similar to the following histogram.
However, instead of absolute counts, I would like to have percentages on the y axis. So basically, I would like to write a function that goes through my data frame and, for each 'purpose', computes the percentage of accounts that were fully paid vs. not fully paid.
I tried writing functions to do this, but they end up being extremely long. Is there a simple way to do this with pandas in Python?
Basically, my thought process is the following:
- This graph doesn't tell me much about the percentage of people that defaulted on their loans. For example, the debt_consolidation category has more people defaulting than the credit_card category, but there are also more people in that category. Therefore, I would like to graph the percentages.
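To make the goal concrete, here is a minimal sketch of the numbers I am after, computed on a small made-up data frame. The toy values are invented for illustration; only the column names 'purpose' and 'not.fully.paid' come from my real data. It uses pd.crosstab with normalize="index", which as far as I understand divides each row by its row total:

```python
import pandas as pd

# Toy data, invented for illustration (not my real data set)
toy = pd.DataFrame({
    "purpose": ["credit_card", "credit_card", "credit_card", "credit_card",
                "debt_consolidation", "debt_consolidation"],
    "not.fully.paid": [0, 0, 0, 1, 0, 1],
})

# Share of each outcome within each purpose, scaled to percentages.
# normalize="index" makes every row sum to 1 before scaling.
pct = pd.crosstab(toy["purpose"], toy["not.fully.paid"], normalize="index") * 100
print(pct)
```

Each row of pct should add up to 100, so categories of different sizes become directly comparable.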
The dataframe that I'm working with is shown below.
The code for the original histogram is:
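import matplotlib.pyplot as plt
import seaborn as sns

plt.figure(figsize=(10, 6))
# Count accounts per purpose, split by repayment status
sns.countplot(x='purpose', hue='not.fully.paid', data=df, palette='coolwarm')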
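For comparison, the percentage version of the plot I have in mind would look roughly like the sketch below. It assumes a row-normalized crosstab is an acceptable way to get the percentages, and it again uses invented toy data rather than my real data frame; I am not sure this is the most idiomatic approach.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Toy data, invented for illustration (not my real data set)
toy = pd.DataFrame({
    "purpose": ["credit_card", "credit_card", "credit_card", "credit_card",
                "debt_consolidation", "debt_consolidation"],
    "not.fully.paid": [0, 0, 0, 1, 0, 1],
})

# Percentage of each outcome within each purpose
pct = pd.crosstab(toy["purpose"], toy["not.fully.paid"], normalize="index") * 100

# Grouped bars: one group per purpose, one bar per outcome
ax = pct.plot(kind="bar", figsize=(10, 6), colormap="coolwarm")
ax.set_ylabel("Percentage of accounts")
ax.legend(title="not.fully.paid")
plt.tight_layout()
```

Calling plt.show() afterwards displays the figure when running this outside a notebook.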