I have a large database of housing data and need to fill in the missing values with the mean of the same class. For example, a missing value in the "Bedrooms" column should be filled with the mean number of bedrooms of houses with the same or similar size and price. The sizes are stored in sq. ft. in an attribute called "Area" (shown as "Size(sq. ft.)" in the sample below). The Area and price attributes take many distinct values, so I'm unsure how to approach this. Is there a simple way to do this in Python? Also, would it be more suitable to combine the areas into intervals, so that there are fewer distinct values, and compute the mean for each interval?
Here is the sample data:
location  bedrooms  Size(sq. ft.)  price
abc       7         4500           5.5 Crore
cde       6         2250           2.1 Crore
bda       7         4500           4.75 Crore
abc       NA        4500           4.5 Crore
abc       5         2250           2.3 Crore
bda       NA        1350           54 Lakh
cde       5         1575           1.6 Crore
bda       NA        2452           3.25 Crore
bda       3         1260           95 Lakh
cde       6         2250           2.15 Crore
abc       8         4500           3.5 Crore
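
For reference, here is a minimal sketch of the interval idea in pandas, not a definitive solution. It assumes the price strings can be converted to a single unit (1 Crore = 100 Lakh) and that hand-picked bin edges are acceptable. The column names `area`, `price_lakh`, `area_bin`, `price_bin` and `bedrooms_filled`, the helper `price_to_lakh`, and the bin edges themselves are all made up for illustration. Each missing bedroom count is filled with the mean of its (area interval, price interval) group, falling back to the area interval alone when that group has no known values.

```python
import io
import pandas as pd

# Sample data from the question; "Size(sq. ft.)" is renamed to "area" here.
raw = """location,bedrooms,area,price
abc,7,4500,5.5 Crore
cde,6,2250,2.1 Crore
bda,7,4500,4.75 Crore
abc,NA,4500,4.5 Crore
abc,5,2250,2.3 Crore
bda,NA,1350,54 Lakh
cde,5,1575,1.6 Crore
bda,NA,2452,3.25 Crore
bda,3,1260,95 Lakh
cde,6,2250,2.15 Crore
abc,8,4500,3.5 Crore
"""
df = pd.read_csv(io.StringIO(raw))  # "NA" is parsed as a missing value


def price_to_lakh(text):
    """Hypothetical helper: convert '5.5 Crore' / '54 Lakh' to a number in Lakh."""
    value, unit = text.split()
    return float(value) * (100.0 if unit == "Crore" else 1.0)


df["price_lakh"] = df["price"].map(price_to_lakh)

# Assumed bin edges, chosen by eye for this sample; real data needs its own.
df["area_bin"] = pd.cut(df["area"], bins=[0, 1500, 3000, 6000], labels=False)
df["price_bin"] = pd.cut(df["price_lakh"], bins=[0, 100, 300, 1000], labels=False)

# Mean bedrooms per (area, price) interval, aligned to the original rows.
fine = df.groupby(["area_bin", "price_bin"])["bedrooms"].transform("mean")
# Fallback: mean per area interval only, for groups with no known bedrooms.
coarse = df.groupby("area_bin")["bedrooms"].transform("mean")

df["bedrooms_filled"] = df["bedrooms"].fillna(fine).fillna(coarse)
print(df[["location", "area", "price", "bedrooms", "bedrooms_filled"]])
```

The filled values are means and so may be fractional; rounding them to whole bedrooms is a separate choice. If hand-picked edges are undesirable, `pd.qcut` could be used instead to form intervals with roughly equal row counts.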