I have a DataFrame named data2 which consists of 583 observations and 11 variables. There are outliers in the data. I want to impute the outliers in 3 of my variables, named a, b and c, all of int64 type, using the IQR rule and mean imputation. I created two variables from data2, Q1 and Q3.
Q1 = data2[['a','b','c']].quantile(0.25)
Q3 = data2[['a','b','c']].quantile(0.75)
IQR = Q3 - Q1
print(IQR)
Then I defined two more variables, lower_limit and upper_limit.
lower_limit = Q1 - 1.5 * IQR
upper_limit = Q3 + 1.5 * IQR
Then I computed the mean values of a, b and c.
mean_value = data2[['a','b','c']].mean()
print(mean_value)
Then I created a function.
def imputer(value):
    if value < lower_limit or value > upper_limit:
        return mean_value
    else:
        return value
Now I want to apply the imputer function I created earlier to the DataFrame.
results = data2[['a','b','c']].apply(imputer) #Error Line
It shows me this error: ValueError: Can only compare identically-labeled Series objects
Any help is appreciated.
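A possible explanation and workaround, offered as a sketch rather than a definitive fix: DataFrame.apply passes each column to imputer as a whole Series indexed by row, so value < lower_limit compares it against a Series indexed by a, b and c, and pandas refuses to compare Series whose labels differ. The version below instead looks up each column's own scalar limits through col.name and replaces outliers with Series.mask. The small data2 here is made-up stand-in data, and impute_column is a hypothetical name. The mean is still computed with the outliers included, as in the code above.

```python
import pandas as pd

# Made-up stand-in for data2 (the real frame has 583 rows and 11 columns)
data2 = pd.DataFrame({
    "a": [1, 2, 3, 4, 100],
    "b": [10, 11, 12, 13, -50],
    "c": [5, 5, 6, 6, 7],
})
cols = ["a", "b", "c"]

Q1 = data2[cols].quantile(0.25)
Q3 = data2[cols].quantile(0.75)
IQR = Q3 - Q1
lower_limit = Q1 - 1.5 * IQR
upper_limit = Q3 + 1.5 * IQR
mean_value = data2[cols].mean()

def impute_column(col):
    # apply() hands over one whole column; col.name is its label ('a', 'b' or 'c')
    col = col.astype(float)  # the mean is a float, so work in float from the start
    is_outlier = (col < lower_limit[col.name]) | (col > upper_limit[col.name])
    # Replace flagged values with that column's mean, keep the rest unchanged
    return col.mask(is_outlier, mean_value[col.name])

results = data2[cols].apply(impute_column)
print(results)
```

To write the imputed values back, something like data2[cols] = results should work, with the caveat that the three columns become float64 instead of int64.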