I want to create a histogram in Python similar to the one below.

enter image description here

However, instead of absolute values, I would like to have percentage values on the y axis. So basically, I would like to write a function that goes through my data frame and for each 'purpose' it generates a percentage of accounts that were fully paid vs. not fully paid.

I tried writing functions to do this, but they end up being extremely long. Is there a simple way to do this with pandas in Python?

Basically, my thought process is the following:

  • This graph doesn't tell me much about the percentage of people that defaulted on their loans. For example, the debt_consolidation category has more people defaulting than the credit_card category, but there are also more people in that category. Therefore, I would like to graph the percentages.
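To make the goal concrete, here is a minimal sketch of the per-purpose percentages I am after. It uses a small made-up data frame (`df_demo` and its values are hypothetical), assumes the columns are named `purpose` and `not.fully.paid` as in my plotting code, and relies on `pd.crosstab` with `normalize="index"`, which may or may not be the best approach:

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the real data frame.
df_demo = pd.DataFrame({
    "purpose": [
        "credit_card", "credit_card",
        "debt_consolidation", "debt_consolidation",
        "debt_consolidation", "debt_consolidation",
    ],
    "not.fully.paid": [0, 1, 0, 0, 0, 1],
})

# Cross-tabulate purpose against not.fully.paid and normalise each row,
# so every purpose's row sums to 100 percent.
pct = pd.crosstab(
    df_demo["purpose"], df_demo["not.fully.paid"], normalize="index"
) * 100

print(pct)
```

With a table like this, something along the lines of `pct.plot(kind="bar")` should presumably give grouped bars with percentages on the y axis.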

The dataframe that I'm working with is shown below.

The code for the original histogram is:

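import matplotlib.pyplot as plt
import seaborn as sns

# Count of accounts per purpose, split by whether the loan was not fully paid
plt.figure(figsize=(10, 6))
sns.countplot(data=df, x='purpose', hue='not.fully.paid', palette='coolwarm')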

enter image description here

bugsyb