
I am doing my very first project with Pandas and this is the data frame I currently have (linked image). I need to calculate the percentage of bachelor degrees, which I read is done by dividing the value by the sum of all the values and then multiplying the result by 100. So I need to count all the bachelor degrees and all the rest of the educational values. My question is: when a column holds objects as values, do I necessarily have to convert them to float/int in order to sum them?

I tried to use the .count() function, which actually counts all the values, but I feel like this is not the best/preferred method:

bachelor_number = bachelor_df['education'].count()
total_number = demographic_data_df['education'].count()
bachelor_percentage = (bachelor_number * 100) / (total_number)
print('The percentage of people who have a degree is {}%'.format(round(bachelor_percentage, 1)))

Any advice for a newbie would be super appreciated. Thanks

Lore MID
  • If they are not numeric, then `+` resolves to a different overload. Simple example: `'a' + 'b'` produces `'ab'`. Now assume you had `'1' + '3'`: this gives `'13'`. But is that what you want? E.g. if you had lists, `sum([[1],[2]], [])` results in `[1, 2]`. Is that what you want? So the objective determines whether the data need to be of a particular type. – Onyambu Feb 28 '23 at 08:41
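To make the comment above concrete, here is a minimal sketch using a made-up Series of digit strings (the values are hypothetical, not taken from the question's data). It suggests that summing an object column concatenates the strings, that `pd.to_numeric` is one way to get arithmetic addition, and that counting rows needs no conversion at all.

```python
import pandas as pd

# Hypothetical column: numbers stored as strings, forced to object dtype
s = pd.Series(['1', '3', '5'], dtype=object)

# Summing strings applies '+' pairwise, i.e. concatenation
concatenated = s.sum()

# Convert to a numeric dtype first to get arithmetic addition
numeric_total = pd.to_numeric(s).sum()

# count() only counts non-null entries, so the dtype does not matter
n_rows = s.count()

print(repr(concatenated), numeric_total, n_rows)
```

For a percentage of categories such as education levels, only the counts matter, so no conversion to float/int should be needed.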

1 Answer


Use the `value_counts()` method instead. Something like `bachelor_df['education'].value_counts()` counts how many rows hold each distinct value, which should give you what you want.
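As a rough sketch of how that could yield the percentage directly: the data frame below is a small hypothetical stand-in for `demographic_data_df` (the real data is not shown in the question), and `'Bachelors'` is an assumed label for the category. Passing `normalize=True` to `value_counts()` returns fractions instead of raw counts.

```python
import pandas as pd

# Hypothetical stand-in for demographic_data_df; the real data is not shown
demographic_data_df = pd.DataFrame(
    {'education': ['Bachelors', 'HS-grad', 'Bachelors', 'Masters', 'HS-grad']}
)

# Share of each category, as a percentage of all non-null rows
education_pct = demographic_data_df['education'].value_counts(normalize=True) * 100

# 'Bachelors' is an assumed label; use whatever string the column contains
bachelor_percentage = round(education_pct['Bachelors'], 1)

# Alternative: the mean of a boolean mask is the fraction of True values
# (this divides by all rows, including any with missing education)
is_bachelor = demographic_data_df['education'] == 'Bachelors'
bachelor_percentage_alt = round(is_bachelor.mean() * 100, 1)

print('The percentage of people who have a bachelor degree is {}%'.format(bachelor_percentage))
```

Either form avoids building a separate filtered data frame just to count its rows.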

S.Chauhan