Replace null values per country with the min of a column for that country specifically

Question

I'm trying to:

Step 1. Get the minimum incidence of malaria for each country.
Step 2. If a country has NaN values in the 'IncidenceOfMalaria' column, fill them with the minimum of that column for that specific country, not the minimum of the entire column.

My attempt

import pandas as pd
malaria_data = pd.read_csv('DatasetAfricaMalaria.csv')
# Minimum incidence per country, sorted ascending
malaria_data["IncidenceOfMalaria"].groupby(malaria_data['CountryName']).min().sort_values()

I'm stuck at this point. How can I proceed, or what should I do differently?

https://stackoverflow.com/questions/20109391/how-to-make-good-reproducible-pandas-examples — Paul H, Jul 25 '22 at 16:42

INGl0R1AM0R1 · Accepted Answer · 2022-07-25T16:57:39.703

0

A better approach would be something like this:

malaria_data.groupby('CountryName')['IncidenceOfMalaria'].apply(lambda gp: gp.fillna(gp.min()))

This will probably give you what you want. I didn't test it because there is no sample data, so please tell me if an error occurs.
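For comparison, here is a minimal sketch of the same idea using `transform` instead of `apply`. This is a swapped-in alternative, not the code from the answer: `transform("min")` returns one value per original row, so the result keeps the DataFrame's own index and can be assigned straight back to the column, whereas the index returned by `apply` may also include the group key depending on the pandas version. The sample rows below are hypothetical and only stand in for the real CSV.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real DatasetAfricaMalaria.csv is not reproduced here.
malaria_data = pd.DataFrame({
    "CountryName": ["Algeria", "Algeria", "Algeria", "Benin", "Benin", "Chad"],
    "IncidenceOfMalaria": [0.5, np.nan, 0.25, 300.0, np.nan, np.nan],
})

# For every row, the minimum of that row's own country (same index as malaria_data).
country_min = malaria_data.groupby("CountryName")["IncidenceOfMalaria"].transform("min")

# Fill only the NaN entries; existing values are left untouched.
filled = malaria_data["IncidenceOfMalaria"].fillna(country_min)

# Replace the whole column with the filled Series.
malaria_data["IncidenceOfMalaria"] = filled
```

Note that a country whose values are all NaN (Chad in this sample) has no minimum to fall back on, so its rows stay NaN and would need a separate rule.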

edited Jul 25 '22 at 16:57

answered Jul 25 '22 at 16:27


Thanks for attempting the question. You can get the dataset from here https://github.com/carlw0194/Data-Analysis-Projects/blob/main/DatasetAfricaMalaria.csv. Also, the return type of the code above is a tuple. The desired return type is a Series that can be used to replace an entire column in the dataset. – Philosophia Jul 25 '22 at 16:48
Why don't you try it again, buddy? I re-edited it, and I'm pretty sure it now returns a Series. – INGl0R1AM0R1 Jul 25 '22 at 16:59
BABOOOYEE, you're welcome – INGl0R1AM0R1 Jul 25 '22 at 17:14